digitalmars.D.learn - Array start index
- DLearner (9/9) Aug 01 2015 Does the D language set in stone that the first element of an
- Rikki Cattermole (22/31) Aug 01 2015 In c style languages (like D) the index actually defines the offset in
- bachmeier (8/17) Aug 01 2015 Seems you could easily wrap an array in a struct, define opIndex
- John Colvin (7/16) Aug 01 2015 See
- DLearner (24/43) Aug 01 2015 D is a C derivative, so it seems a shame not to identify causes
- Andrej Mitrovic via Digitalmars-d-learn (17/20) Aug 01 2015 This has already been done! D defines an array to be a struct with a
- bachmeier (8/30) Aug 01 2015 But what type of programming are you doing? Even after decades of
- QAston (5/12) Aug 02 2015 Adding 1-indexed arrays to the language fixes nothing. Just write
- bachmeier (6/10) Aug 03 2015 Oh, I don't think that's a good idea. It's too confusing to have
- DLearner (8/18) Aug 03 2015 Looks like 0-base is fixed, to avoid problems with existing code.
- Jonathan M Davis via Digitalmars-d-learn (20/40) Aug 03 2015 Almost all programming languages in heavy use at this point in time star...
- QAston (13/21) Aug 04 2015 There're quite a few things stopping this from being added to the
- jmh530 (7/16) Aug 03 2015 I come from matlab and R which start matrices at 1. I used to
- Observer (8/16) Aug 03 2015 This experiment has already been run. Perl used to support a $[
- "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (31/36) Aug 04 2015 I, too, don't think this is a good idea in general, but I can see
- Bastiaan Veelo (13/15) Feb 06 2017 Careful here, you've got the arguments reversed. The unit test
- Bastiaan Veelo (14/15) Feb 06 2017 Sorry for that misinformation. I should have said that
- pineapple (9/9) Feb 06 2017 One reason for zero-based indexes that isn't "it's what we're all
- Nemanja Boric (3/12) Feb 06 2017 Related:
- bachmeier (5/14) Feb 06 2017 In the scientific programming community, languages like Matlab,
- Bastiaan Veelo (103/103) Feb 06 2017 (There is an honest question in the end, please read on.)
- =?UTF-8?Q?Ali_=c3=87ehreli?= (30/39) Feb 06 2017 struct StaticArray(T, ptrdiff_t first, ptrdiff_t last) {
- Bastiaan Veelo (115/123) Feb 07 2017 Thank you very much for your time. Sadly this gives me an access
- =?UTF-8?Q?Ali_=c3=87ehreli?= (32/41) Feb 07 2017 You forgot to call that most important function. ;)
- Bastiaan Veelo (9/42) Feb 07 2017 Hah of course. I assumed the name would give it some special
- =?UTF-8?Q?Ali_=c3=87ehreli?= (9/10) Feb 07 2017 Of course we have to and the struct that I wrote is illegal because it
- Bastiaan Veelo (12/22) Feb 07 2017 Gosh am I glad I brought that up, and for people like you hanging
- =?UTF-8?Q?Ali_=c3=87ehreli?= (28/40) Feb 07 2017 Mandatory disclaimer: We can't be sure without testing.
- Bastiaan Veelo (12/12) Feb 08 2017 Wrapup: I am going to go for the original approach of index
- WhatMeForget (7/16) Feb 06 2017 There is a good metaphor in multi-story buildings. Americans
Does the D language set in stone that the first element of an array _has_ to be index zero? Wouldn't starting array elements at one avoid the common 'off-by-one' logic error, it does seem more natural to begin a count at 1. Actually, maybe even better to allow array definitions of form int foo[x:y]; (y >= x) creating integer variables foo[x], foo[x+1],...,foo[y]. I think the (very old) IBM PL/I language was like this.
Aug 01 2015
On 1/08/2015 9:35 p.m., DLearner wrote:Does the D language set in stone that the first element of an array _has_ to be index zero? Wouldn't starting array elements at one avoid the common 'off-by-one' logic error, it does seem more natural to begin a count at 1. Actually, maybe even better to allow array definitions of form int foo[x:y]; (y >= x) creating integer variables foo[x], foo[x+1],...,foo[y]. I think the (very old) IBM PL/I language was like this.In c style languages (like D) the index actually defines the offset in memory. Not actually the index. So while in some languages 1 is used instead of 0, 0 maps better to the hardware. Think of this byte array: ubyte* myptr = [ |---|-------| | i | value | |---|-------| | 0 | 1 | | 1 | 255 | ]; For 0 start of index: size_t i = 0; assert(myptr[i] == 1); For 1 start of index: size_t i = 1; assert(i != 0); assert(myptr[i-1] == 1); While this is not the complete reason why 0 is chosen, it is something to think about.
Aug 01 2015
On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:Does the D language set in stone that the first element of an array _has_ to be index zero? Wouldn't starting array elements at one avoid the common 'off-by-one' logic error, it does seem more natural to begin a count at 1. Actually, maybe even better to allow array definitions of form int foo[x:y]; (y >= x) creating integer variables foo[x], foo[x+1],...,foo[y]. I think the (very old) IBM PL/I language was like this.Seems you could easily wrap an array in a struct, define opIndex and opSlice appropriately, and use alias this to keep the other properties of the array. The problem you'll run into is interacting with other code that assumes zero-based indexing. I thought about setting the first index at 1 for my library that embeds D inside R, since that's how it's done in R, but very quickly realized how confusing it would be.
Aug 01 2015
On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:Does the D language set in stone that the first element of an array _has_ to be index zero?For the builtin slice types? Yes, set in stone.Wouldn't starting array elements at one avoid the common 'off-by-one' logic error, it does seem more natural to begin a count at 1. Actually, maybe even better to allow array definitions of form int foo[x:y]; (y >= x) creating integer variables foo[x], foo[x+1],...,foo[y]. I think the (very old) IBM PL/I language was like this.See https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html As other commenters have/will point out, you can easily define a custom type in D that behaves as you describe, and - Dijkstra notwithstanding - there are valid uses for such things.
Aug 01 2015
On Saturday, 1 August 2015 at 17:55:06 UTC, John Colvin wrote:On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:D is a C derivative, so it seems a shame not to identify causes of bugs in C, and design them out in D. For example, in C difficult to write non-trivial commercial programs without using pointers. Pointer manipulation has a terrible reputation for bugs. But in D, easy to write commercial programs without using pointers. Problem has been designed away. Similarly, off-by-one array bugs are commonplace in C. We should seek to eliminate the source of those bugs, which basically reduces to the issue that programmers find it unnatural to start a count at zero. Whether they _should_ find a zero start unnatural is irrelevant - they just do as an observed fact, so let's change the language so the issue is avoided (designed away). Suggestion: if the codebase for D is considered so large that zero-basing cannot now be changed, why not extend the language to allow for array definitions like 'int[x:y] foo'? And then have a rule that 'int[:y] bar' defines a 1-based array of y elements?Does the D language set in stone that the first element of an array _has_ to be index zero?For the builtin slice types? Yes, set in stone.Wouldn't starting array elements at one avoid the common 'off-by-one' logic error, it does seem more natural to begin a count at 1. Actually, maybe even better to allow array definitions of form int foo[x:y]; (y >= x) creating integer variables foo[x], foo[x+1],...,foo[y]. I think the (very old) IBM PL/I language was like this.See https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html As other commenters have/will point out, you can easily define a custom type in D that behaves as you describe, and - Dijkstra notwithstanding - there are valid uses for such things.
Aug 01 2015
On 8/1/15, DLearner via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> wrote:D is a C derivative, so it seems a shame not to identify causes of bugs in C, and design them out in D.This has already been done! D defines an array to be a struct with a pointer and a length. See this article: http://www.drdobbs.com/architecture-and-design/cs-biggest-mistake/228701625 I would argue it's not "off-by-one" that's causing most issues when dealing with C "arrays", but instead it's in general out-of-bounds issues (whether it's off bye one or off by 50..) since you often don't have the length or could easily use the wrong variable as the length. Think about how much D code would actually have subtle off-by-one errors if D didn't use 0-based indexing like the majority of popular languages use. Any time you would interface with other languages you would have to double, triple-check all your uses of arrays. FWIW at the very beginning I also found it odd that languages use 0-based indexing, but that was before I had any significant programming experience under my belt. By now it's second nature to me to use 0-based indexing.
Aug 01 2015
On Saturday, 1 August 2015 at 19:04:10 UTC, Andrej Mitrovic wrote:On 8/1/15, DLearner via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> wrote:But what type of programming are you doing? Even after decades of programming and trying out dozens of languages, zero-based indexing still gets me at times when the arrays I work with represent vectors and matrices. Especially when porting code from other languages that use one-based indexing. One of the nice things about D is that it gives you the tools to easily make the change if you want.D is a C derivative, so it seems a shame not to identify causes of bugs in C, and design them out in D.This has already been done! D defines an array to be a struct with a pointer and a length. See this article: http://www.drdobbs.com/architecture-and-design/cs-biggest-mistake/228701625 I would argue it's not "off-by-one" that's causing most issues when dealing with C "arrays", but instead it's in general out-of-bounds issues (whether it's off bye one or off by 50..) since you often don't have the length or could easily use the wrong variable as the length. Think about how much D code would actually have subtle off-by-one errors if D didn't use 0-based indexing like the majority of popular languages use. Any time you would interface with other languages you would have to double, triple-check all your uses of arrays. FWIW at the very beginning I also found it odd that languages use 0-based indexing, but that was before I had any significant programming experience under my belt. By now it's second nature to me to use 0-based indexing.
Aug 01 2015
On Saturday, 1 August 2015 at 23:02:51 UTC, bachmeier wrote:But what type of programming are you doing? Even after decades of programming and trying out dozens of languages, zero-based indexing still gets me at times when the arrays I work with represent vectors and matrices. Especially when porting code from other languages that use one-based indexing. One of the nice things about D is that it gives you the tools to easily make the change if you want.Adding 1-indexed arrays to the language fixes nothing. Just write your 1-indexed array type and if you enjoy using it, publish it as a library. Who knows, if demand is high it may even end up in phobos.
Aug 02 2015
On Sunday, 2 August 2015 at 21:58:48 UTC, QAston wrote:Adding 1-indexed arrays to the language fixes nothing. Just write your 1-indexed array type and if you enjoy using it, publish it as a library. Who knows, if demand is high it may even end up in phobos.Oh, I don't think that's a good idea. It's too confusing to have more than one method of indexing within the same language. You just have to do a thorough job of testing, as the possibility of errors is something you'll have to live with, given the different design choices of different languages.
Aug 03 2015
On Monday, 3 August 2015 at 13:45:01 UTC, bachmeier wrote:On Sunday, 2 August 2015 at 21:58:48 UTC, QAston wrote:Looks like 0-base is fixed, to avoid problems with existing code. But nothing stops _adding_ to the language by allowing int[x:y] foo to mean valid symbols are foo[x], foo[x+1],..., foo[y]. Plus rule that int[:y] means valid symbols are foo[1], foo[2],..., foo[y]. That way, 1-start achieved, with no conflict with existing code?Adding 1-indexed arrays to the language fixes nothing. Just write your 1-indexed array type and if you enjoy using it, publish it as a library. Who knows, if demand is high it may even end up in phobos.Oh, I don't think that's a good idea. It's too confusing to have more than one method of indexing within the same language. You just have to do a thorough job of testing, as the possibility of errors is something you'll have to live with, given the different design choices of different languages.
Aug 03 2015
On Monday, August 03, 2015 21:32:03 DLearner via Digitalmars-d-learn wrote:On Monday, 3 August 2015 at 13:45:01 UTC, bachmeier wrote:Almost all programming languages in heavy use at this point in time start indexing at 0. It would be highly confusing to almost all programmers out there to have 1-based indexing. In addition, having 0-based indexing actually makes checking against the end of arrays and other random-access ranges easier. You can just check against length without having to do any math. In general, I would expect 1-based indexing to _increase_ the number of off by one errors in code - both because 0-based indexing helps avoid such problems when dealing with the end of the array and more importantly, because almost everyone expects 0-based indexing. You're really barking up the wrong tree if you're trying to get any support for 1-based indexing in D. I doubt that you will see much of anyone who thinks that it's even vaguely a good idea, and there's no way that Walter or Andrei (or probably anyone in the main dev team) who is going to agree that it's even worth considering. I think that the reality of the matter is that if you're going to do much programming - especially if you're going to be professional programmer - you just need to get used to the idea that array indices start at 0. There are a few languages out there where they don't, but they are far from the norm. - Jonathan M DavisOn Sunday, 2 August 2015 at 21:58:48 UTC, QAston wrote:Looks like 0-base is fixed, to avoid problems with existing code. But nothing stops _adding_ to the language by allowing int[x:y] foo to mean valid symbols are foo[x], foo[x+1],..., foo[y]. Plus rule that int[:y] means valid symbols are foo[1], foo[2],..., foo[y]. That way, 1-start achieved, with no conflict with existing code?Adding 1-indexed arrays to the language fixes nothing. Just write your 1-indexed array type and if you enjoy using it, publish it as a library. Who knows, if demand is high it may even end up in phobos.Oh, I don't think that's a good idea. It's too confusing to have more than one method of indexing within the same language. You just have to do a thorough job of testing, as the possibility of errors is something you'll have to live with, given the different design choices of different languages.
Aug 03 2015
On Monday, 3 August 2015 at 21:32:05 UTC, DLearner wrote:Looks like 0-base is fixed, to avoid problems with existing code. But nothing stops _adding_ to the language by allowing int[x:y] foo to mean valid symbols are foo[x], foo[x+1],..., foo[y]. Plus rule that int[:y] means valid symbols are foo[1], foo[2],..., foo[y]. That way, 1-start achieved, with no conflict with existing code?There're quite a few things stopping this from being added to the language. 1. People will have to learn this new feature and it's interaction with gazillion of other D features. 2. There would be a redundancy - core language will have 2 array types while one of them can be easily implemented using the other. 3. Devs will have to maintain it - as if they don't have enough things to fix atm. Really, this is so simple to do as a library - just use opIndex, opSlice with a template struct. As a general rule - start asking for language features only when things can't be done without them.
Aug 04 2015
On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:Does the D language set in stone that the first element of an array _has_ to be index zero? Wouldn't starting array elements at one avoid the common 'off-by-one' logic error, it does seem more natural to begin a count at 1. Actually, maybe even better to allow array definitions of form int foo[x:y]; (y >= x) creating integer variables foo[x], foo[x+1],...,foo[y]. I think the (very old) IBM PL/I language was like this.I come from matlab and R which start matrices at 1. I used to think it was more natural. However after I started using numpy I now think 0 index is better. Also see http://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html Give it a try. You might find you like it.
Aug 03 2015
On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:Does the D language set in stone that the first element of an array _has_ to be index zero? Wouldn't starting array elements at one avoid the common 'off-by-one' logic error, it does seem more natural to begin a count at 1. Actually, maybe even better to allow array definitions of form int foo[x:y]; (y >= x) creating integer variables foo[x], foo[x+1],...,foo[y].This experiment has already been run. Perl used to support a $[ variable to set the array base. After experience with the confusion and problems that causes, it was finally deprecated and effectively removed from the language. See the end paragraphs of http://perldoc.perl.org/perlvar.html and also http://search.cpan.org/~wolfsage/perl/ext/arybase/arybase.pm for more info.
Aug 03 2015
On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:Does the D language set in stone that the first element of an array _has_ to be index zero? Wouldn't starting array elements at one avoid the common 'off-by-one' logic error, it does seem more natural to begin a count at 1.I, too, don't think this is a good idea in general, but I can see a few use-cases where 1-based indices may be more natural. It's easy to define a wrapper: struct OneBasedArray(T) { T[] _payload; alias _payload this; T opIndex(size_t index) { assert(index > 0); return _payload[index-1]; } void opIndexAssign(U : T)(size_t index, auto ref U value) { assert(index > 0); _payload[index-1] = value; } } unittest { OneBasedArray!int arr; arr = [1,2,3]; arr ~= 4; assert(arr.length == 4); assert(arr[1] == 1); assert(arr[2] == 2); assert(arr[3] == 3); assert(arr[4] == 4); } Test with: rdmd -main -unittest xx.d This can of course be easily extended to support other bases than one.
Aug 04 2015
On Tuesday, 4 August 2015 at 08:18:50 UTC, Marc Schütz wrote:void opIndexAssign(U : T)(size_t index, auto ref U value) {Careful here, you've got the arguments reversed. The unit test didn't detect this because it was ambiguous. This one isn't: unittest { OneBasedArray!int arr; arr = [1,2,3]; arr ~= 14; assert(arr.length == 4); assert(arr[1] == 1); assert(arr[2] == 2); assert(arr[3] == 3); assert(arr[4] == 14); }
Feb 06 2017
On Monday, 6 February 2017 at 14:26:35 UTC, Bastiaan Veelo wrote:The unit test didn't detect this because it was ambiguous.Sorry for that misinformation. I should have said that opIndexAssign wasn't tested. Here is a better test. unittest { OneBasedArray!int arr; arr = [1,2,3]; arr ~= 4; arr[4] = 14; assert(arr.length == 4); assert(arr[1] == 1); assert(arr[2] == 2); assert(arr[3] == 3); assert(arr[4] == 14); }
Feb 06 2017
One reason for zero-based indexes that isn't "it's what we're all used to" is that if you used one-based indexes, you would be able to represent one fewer index than zero-based, since one of the representable values - zero - could no longer be used to represent any index. Also, it's what we're all used to, and it makes perfect sense to a lot of us, and the only times in recent memory I've ever made off-by-one errors were when I was trying to use Lua and its one-based indexing.
Feb 06 2017
On Monday, 6 February 2017 at 18:55:19 UTC, pineapple wrote:One reason for zero-based indexes that isn't "it's what we're all used to" is that if you used one-based indexes, you would be able to represent one fewer index than zero-based, since one of the representable values - zero - could no longer be used to represent any index. Also, it's what we're all used to, and it makes perfect sense to a lot of us, and the only times in recent memory I've ever made off-by-one errors were when I was trying to use Lua and its one-based indexing.Related: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html
Feb 06 2017
On Monday, 6 February 2017 at 18:55:19 UTC, pineapple wrote:One reason for zero-based indexes that isn't "it's what we're all used to" is that if you used one-based indexes, you would be able to represent one fewer index than zero-based, since one of the representable values - zero - could no longer be used to represent any index. Also, it's what we're all used to, and it makes perfect sense to a lot of us, and the only times in recent memory I've ever made off-by-one errors were when I was trying to use Lua and its one-based indexing.In the scientific programming community, languages like Matlab, Fortran, and R use one-based indexing for a very good reason. Zero-based indexing is a computer science thing that makes no sense to someone working with equations.
Feb 06 2017
(There is an honest question in the end, please read on.) All good reasons set aside, both in favour and against 0-based arrays, the only reason that is relevant to me right now is that we are seriously looking into translating close to a million lines of foreign code to D, from a language that supports arrays over arbitrary ranges of indices (positive and negative). Adapting this much program logic to a different array base is out of the question; this is an engineering application and lives are at stake. So let's not bring the base-arguments back into this sub-thread but focus on a performant solution. Expanding on Marc's outset, I now have: /***************** * A fixed-length array with an index that runs from $(D_PARAM first) * to $(D_PARAM last) inclusive. * * Indices are converted, which involves a small overhead. */ struct StaticArray(T, ptrdiff_t first, ptrdiff_t last) { T[last - first + 1] _payload; alias _payload this; // Support e = arr[5]; ref T opIndex(ptrdiff_t index) { assert(index >= first); assert(index <= last); return _payload[index - first]; } // Support arr[5] = e; void opIndexAssign(U : T)(auto ref U value, ptrdiff_t index) { assert(index >= first); assert(index <= last); _payload[index - first] = value; } // Support foreach(e; arr). int opApply(scope int delegate(ref T) dg) { int result = 0; for (int i = 0; i < _payload.length; i++) { result = dg(_payload[i]); if (result) break; } return result; } // Support foreach(i, e; arr). int opApply(scope int delegate(ptrdiff_t index, ref T) dg) { int result = 0; for (int i = 0; i < _payload.length; i++) { result = dg(i + first, _payload[i]); if (result) break; } return result; } // Write to binary file. void toFile(string fileName) { import std.stdio; auto f = File(fileName, "wb"); if (f.tryLock) { f.rawWrite(_payload); } } } unittest { StaticArray!(int, -10, 10) arr; assert(arr.length == 21); foreach (ref e; arr) e = 42; assert(arr[-10] == 42); assert(arr[0] == 42); assert(arr[10] == 42); foreach (i, ref e; arr) e = i; assert(arr[-10] == -10); assert(arr[0] == 0); assert(arr[5] == 5); assert(arr[10] == 10); arr[5] = 15; assert(arr[5] == 15); } ////////////// (The first and last indices probably don't need to be template arguments, they could possibly be immutable members of the struct; But that is not what worries me now.) The thing is that a small overhead is paid in opIndex() and opIndexAssign() (the other member functions are fine). I'd like to get rid of that overhead. In "Numerical Recipes in C", section 1.2, Press et al. propose an easy solution using an offset pointer: float b[4], *bb; bb = b - 1; Thus bb[1] through bb[4] all exist, no space is wasted nor is there a run-time overhead. I have tried to adapt the internals of my struct to Press' approach, but it seems to me it is not that easy in D -- or I'm just not proficient enough. Can someone here show me how that could be done? Thanks!
Feb 06 2017
On 02/06/2017 03:00 PM, Bastiaan Veelo wrote:In "Numerical Recipes in C", section 1.2, Press et al. propose an easy solution using an offset pointer: float b[4], *bb; bb = b - 1; Thus bb[1] through bb[4] all exist, no space is wasted nor is there a run-time overhead. I have tried to adapt the internals of my struct to Press' approach, but it seems to me it is not that easy in D -- or I'm just not proficient enough. Can someone here show me how that could be done?struct StaticArray(T, ptrdiff_t first, ptrdiff_t last) { T[last - first + 1] _payload; If the static array has to be a member like that (as opposed to a dynamic one), you need an extra member: T* _ptr; Because structs cannot have default constructors, you need an init() function to establish the invariant: void init() { this._ptr = _payload.ptr - first; } Of course, that relies on the fact that _payload.ptr - first is a valid address to hold in a pointer. [OT]: // Commented-out as it seems dangerous because _payload does not // have the same semantics as StaticArray. // // alias _payload this; Which means, you would implement length() yourself: size_t length() { return _payload.length; } Then you use _ptr when indexing: // Support e = arr[5]; ref T opIndex(ptrdiff_t index) { assert(index >= first); assert(index <= last); return *(_ptr + index); } Ali
Feb 06 2017
On Monday, 6 February 2017 at 23:42:55 UTC, Ali Çehreli wrote:Then you use _ptr when indexing: // Support e = arr[5]; ref T opIndex(ptrdiff_t index) { assert(index >= first); assert(index <= last); return *(_ptr + index); } AliThank you very much for your time. Sadly this gives me an access violation. The traceback doesn't seem right though, as it seems to jump to opIndexAssign from where opIndex is used. If I'm not mistaken, I have confirmed that _ptr is a valid address, see below. We do not need to take measures against the GC? - Bastiaan. On Windows: rdmd -main -unittest -debug -g source\epcompat\array.d object.Error (0): Access Violation ---------------- 0x00402057 in void epcompat.array.__unittestL81_1() at C:\SARC\Pascal2017\D\epcompat\source\epcompat\array.d(88) 0x0040465C in void epcompat.array.__modtest() at C:\SARC\Pascal2017\D\epcompat\source\epcompat\array.d(39) 0x0040A455 in int core.runtime.runModuleUnitTests().__foreachbody1(object.ModuleInfo*) 0x0040C007 in int object.ModuleInfo.opApply(scope int delegate(object.ModuleInfo*)).__lambda2(immutable(object.ModuleInfo*)) 0x00406808 in _d_run_main 0x004046D4 in main at C:\SARC\Pascal2017\D\epcompat\source\epcompat\array.d(7) 0x004225DD in mainCRTStartup 0x779C62C4 in BaseThreadInitThunk 0x77B50FD9 in RtlSubscribeWnfStateChangeNotification 0x77B50FA4 in RtlSubscribeWnfStateChangeNotification The complete source: module epcompat.array; // Test with rdmd -main -unittest -debug -g source\epcompat\array.d alias StaticArray_offset StaticArray; /***************** * A fixed-length array with an index that runs from $(D_PARAM first) * to $(D_PARAM last) inclusive. * * Implemented by means of an offset pointer. */ struct StaticArray_offset(T, ptrdiff_t first, ptrdiff_t last) { T[last - first + 1] _payload; T* _ptr; void init() { assert( first < cast(size_t)_payload.ptr); // Address space underrun. assert(-first < size_t.max - cast(size_t)_payload.ptr); // Address space overrun. this._ptr = _payload.ptr - first; } size_t length() { return _payload.length; } // Support e = arr[5]; ref T opIndex(ptrdiff_t index) { assert(index >= first); assert(index <= last); return *(_ptr + index); } // Support arr[5] = e; void opIndexAssign(U : T)(auto ref U value, ptrdiff_t index) { assert(index >= first); assert(index <= last); *(_ptr + index) = value; } // Line 39 // Support foreach(e; arr). int opApply(scope int delegate(ref T) dg) { int result = 0; for (int i = 0; i < _payload.length; i++) { result = dg(_payload[i]); if (result) break; } return result; } // Support foreach(i, e; arr). int opApply(scope int delegate(ptrdiff_t index, ref T) dg) { int result = 0; for (int i = first; i <= last; i++) { result = dg(i, *(_ptr + i)); if (result) break; } return result; } // Write to binary file. void toFile(string fileName) { import std.stdio; auto f = File(fileName, "wb"); if (f.tryLock) { f.rawWrite(_payload); } } } unittest { StaticArray!(int, -10, 10) arr; assert(arr.length == 21); foreach (ref e; arr) e = 42; assert(arr[-10] == 42); // Line 88 assert(arr[0] == 42); assert(arr[10] == 42); foreach (i, ref e; arr) e = i; assert(arr[-10] == -10); assert(arr[0] == 0); assert(arr[5] == 5); assert(arr[10] == 10); arr[5] = 15; assert(arr[5] == 15); }
Feb 07 2017
On 02/07/2017 02:11 AM, Bastiaan Veelo wrote:void init() { assert( first < cast(size_t)_payload.ptr); // Address space underrun. assert(-first < size_t.max - cast(size_t)_payload.ptr); // Address space overrun. this._ptr = _payload.ptr - first; }You forgot to call that most important function. ;) 1) I don't understand the first assert there, which does not pass for me, so I commented it out. 2) May bad: init() is not a good name for a struct member, so it should be renamed: void initialize() { // assert( first < cast(size_t)_payload.ptr); // Address space underrun. assert(-first < size_t.max - cast(size_t)_payload.ptr); // Address space overrun. this._ptr = _payload.ptr - first; } 3) Instead of having to remember to call it, let's introduce a function that does it for us: auto makeStaticArray(T, ptrdiff_t first, ptrdiff_t last)() { auto s = StaticArray!(T, first, last)(); s.initialize(); return s; } unittest { // StaticArray!(int, -10, 10) arr; auto arr = makeStaticArray!(int, -10, 10);foreach (i, ref e; arr) e = i;Unrelated: That line passes because you're building 32-bits. Here is the error I got: Error: cannot implicitly convert expression (i) of type long to int You can cast it: e = cast(int)i; or by import std.conv : to; e = i.to!int; Ali
Feb 07 2017
On Tuesday, 7 February 2017 at 20:28:30 UTC, Ali Çehreli wrote:You forgot to call that most important function. ;)Hah of course. I assumed the name would give it some special meaning, like postblit.1) I don't understand the first assert there, which does not pass for me, so I commented it out.The intention is to check that (_payload.ptr - first) is larger than 0.2) May bad: init() is not a good name for a struct member, so it should be renamed: void initialize() { // assert( first < cast(size_t)_payload.ptr); // Address space underrun. assert(-first < size_t.max - cast(size_t)_payload.ptr); // Address space overrun. this._ptr = _payload.ptr - first; } 3) Instead of having to remember to call it, let's introduce a function that does it for us: auto makeStaticArray(T, ptrdiff_t first, ptrdiff_t last)() { auto s = StaticArray!(T, first, last)(); s.initialize(); return s; }OK good.unittest { // StaticArray!(int, -10, 10) arr; auto arr = makeStaticArray!(int, -10, 10);Thanks a lot for your illustrative answers, including the next one! Bastiaan.foreach (i, ref e; arr) e = i;Unrelated: That line passes because you're building 32-bits. Here is the error I got: Error: cannot implicitly convert expression (i) of type long to int You can cast it: e = cast(int)i; or by import std.conv : to; e = i.to!int;
Feb 07 2017
Pressed send too soon, before considering your GC question. On 02/07/2017 02:11 AM, Bastiaan Veelo wrote:We do not need to take measures against the GC?Of course we have to and the struct that I wrote is illegal because it is self-referencing through the _ptr member. (D has the right to move structs around, making my _ptr potentially pointing to an illegal address.) This optimization cannot work if the array is a static array inside the same struct. It would work with a dynamic array but then it would probably be slower than applying the *(ptr+index) technique. Ali
Feb 07 2017
On Tuesday, 7 February 2017 at 20:33:35 UTC, Ali Çehreli wrote:On 02/07/2017 02:11 AM, Bastiaan Veelo wrote:Gosh am I glad I brought that up, and for people like you hanging out in the learn group. Thanks for pointing this out!We do not need to take measures against the GC?Of course we have to and the struct that I wrote is illegal because it is self-referencing through the _ptr member. (D has the right to move structs around, making my _ptr potentially pointing to an illegal address.)This optimization cannot work if the array is a static array inside the same struct. It would work with a dynamic array but then it would probably be slower than applying the *(ptr+index) technique.You mean slower than the _payload[index - first] technique? Is that because of heap memory versus stack memory? Am I correct that the reason that a dynamic array would work is because it is allocated independently from the struct, meaning that its address stays the same even if the struct is moved? We could .reserve a dynamic array in initialize() to prevent it being resized. Thanks again. Bastiaan.
Feb 07 2017
On 02/07/2017 02:04 PM, Bastiaan Veelo wrote:Mandatory disclaimer: We can't be sure without testing. Not exactly because stack versus heap because the whole object might be sitting in heap memory anyway: alias S = StaticArray!(/* ... */); S[] objects; objects ~= S(/* ... */); So, all of those are on the heap. It's more about having everything at hand, near each other, close in memory. If the member is a static array, then when we have an S[], all members are there to be operated on. If the array in dynamic, then an S[] has indirections, reaching out to the heap, which may involve long-latency memory reads. foreach (s; objects) { s[42]; // May have a cache miss and read from memory } In the static array case, the entire body of s is already on the cache. On the other hand, when the objects are large, then few of those can fit in the cache. etc. One needs to profile to see what works better.This optimization cannot work if the array is a static array inside the same struct. It would work with a dynamic array but then it would probably be slower than applying the *(ptr+index) technique.You mean slower than the _payload[index - first] technique? Is that because of heap memory versus stack memory?Am I correct that the reason that a dynamic array would work is because it is allocated independently from the struct, meaning that its address stays the same even if the struct is moved?Yes, that would be a requirement to prevent the self-reference.We could .reserve a dynamic array in initialize() to prevent it being resized.Makes sense. Another solution that came to my mind: You can keep the self-referencing solution but make sure that the objects are not moved after initialize(): S[] objects; objects.reserve(100); objects.each!(o => o.initialize()); // The objects will not move if you're carefulThanks again. Bastiaan.Ali
Feb 07 2017
Wrapup: I am going to go for the original approach of index conversion, and leaving the offset-pointer approach for what it is. Reasons: 1) uncertain efficiency gain/loss, 2) theoretically it may fail, 3) .sizeof does not include the payload, 4) analysis of the assembler generated by our reference compiler (Prospero) shows that it also uses index conversion. (Interestingly, it does so at compile time, when possible). Thanks for many interesting insights. Bastiaan.
Feb 08 2017
On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:Does the D language set in stone that the first element of an array _has_ to be index zero? Wouldn't starting array elements at one avoid the common 'off-by-one' logic error, it does seem more natural to begin a count at 1. Actually, maybe even better to allow array definitions of form int foo[x:y]; (y >= x) creating integer variables foo[x], foo[x+1],...,foo[y]. I think the (very old) IBM PL/I language was like this.There is a good metaphor in multi-story buildings. Americans the UK, their first floor is what Americans would call the 2nd floor. But then how can you trust a group of people who drive on the wrong side of the road :)
Feb 06 2017