D - Puzzling array optimization statements in D spec
- Mark Evans (28/28) Sep 29 2002 The online D specification (see excerpt below) has me puzzled. A standa...
- Walter (17/45) Sep 29 2002 I'm not sure what the problem is - you can choose to either have true
- Mark Evans (5/7) Oct 01 2002 The problem was the documentation's remark about pointers-to-rows being ...
- Mac Reiter (15/23) Oct 01 2002 That kinda depends on how you measure efficiency. If you look at instru...
The online D specification (see excerpt below) has me puzzled. A standard rectangular array in C involves a multiply and add for each element access. The standard optimization trick eliminates the multiply. It sets up an auxiliary array of pointers to each row, thereby replacing an expensive multiply with a cheap pointer dereference. The syntax turns out to be identical to a standard access, array[5][6]. The D spec seems to denigrate this whole idea in favor of the non-optimized rectangular layout. Am I reading it correctly? When did this old C trick lose its validity? Mark ========================== Rectangular Arrays Experienced FORTRAN numerics programmers know that multidimensional "rectangular" arrays for things like matrix operations are much faster than trying to access them via pointers to pointers resulting from "array of pointers to array" semantics. For example, the D syntax: double matrix[][]; declares matrix as an array of pointers to arrays. (Dynamic arrays are implemented as pointers to the array data.) Since the arrays can have varying sizes (being dynamically sized), this is sometimes called "jagged" arrays. Even worse for optimizing the code, the array rows can sometimes point to each other! Fortunately, D static arrays, while using the same syntax, are implemented as a fixed rectangular layout: double matrix[3][3]; declares a rectangular matrix with 3 rows and 3 columns, all contiguously in memory. In other languages, this would be called a multidimensional array and be declared as: double matrix[3,3];
Sep 29 2002
I'm not sure what the problem is - you can choose to either have true rectangular arrays or have arrays of pointers to rows. "Mark Evans" <Mark_member pathlink.com> wrote in message news:an8b7o$2pl7$1 digitaldaemon.com...The online D specification (see excerpt below) has me puzzled. A standard rectangular array in C involves a multiply and add for each elementaccess.The standard optimization trick eliminates the multiply. It sets up an auxiliary array of pointers to each row, thereby replacing an expensivemultiplywith a cheap pointer dereference. The syntax turns out to be identical toastandard access, array[5][6]. The D spec seems to denigrate this whole idea in favor of thenon-optimizedrectangular layout. Am I reading it correctly? When did this old C trickloseits validity? Mark ========================== Rectangular Arrays Experienced FORTRAN numerics programmers know that multidimensional "rectangular" arrays for things like matrix operations are much fasterthantrying to access them via pointers to pointers resulting from "array ofpointersto array" semantics. For example, the D syntax: double matrix[][]; declares matrix as an array of pointers to arrays. (Dynamic arrays are implemented as pointers to the array data.) Since the arrays can havevaryingsizes (being dynamically sized), this is sometimes called "jagged" arrays.Evenworse for optimizing the code, the array rows can sometimes point to eachother!Fortunately, D static arrays, while using the same syntax, are implementedas afixed rectangular layout: double matrix[3][3]; declares a rectangular matrix with 3 rows and 3 columns, all contiguouslyinmemory. In other languages, this would be called a multidimensional arrayand bedeclared as: double matrix[3,3];
Sep 29 2002
The problem was the documentation's remark about pointers-to-rows being less efficient than rectangular arrays. AFAIK rectangular arrays are less efficient to access. Mark In article <an8vo9$ekh$1 digitaldaemon.com>, Walter says...I'm not sure what the problem is - you can choose to either have true rectangular arrays or have arrays of pointers to rows.
Oct 01 2002
That kinda depends on how you measure efficiency. If you look at instruction cycle counts, the two pointer dereference is probably faster. But if you consider the cache effects of having each row in a separate chunk of the heap, then each of those dual pointer lookups is going to jump the CPUs attention to a different place (one for the original pointer, one for the row pointer). This can cause cache thrashing. With CPUs running at 7x the speed of even the fastest RAM out there, you can usually afford to do the multiply and add for less real time than the cache misses from the dual dereference. Also, intelligently optimized use of some of the weirder x86 opcodes, like LEA, can make the multiply/add process pretty fast, even in raw cycle counts. But I would expect the cache coherency issues to overwhelm the CPU cycle count issues. Of course, if anyone with more real-world experience cares to tell me that I'm wrong, that wouldn't surprise me... Mac In article <and1va$24b3$1 digitaldaemon.com>, Mark Evans says...The problem was the documentation's remark about pointers-to-rows being less efficient than rectangular arrays. AFAIK rectangular arrays are less efficient to access. Mark In article <an8vo9$ekh$1 digitaldaemon.com>, Walter says...I'm not sure what the problem is - you can choose to either have true rectangular arrays or have arrays of pointers to rows.
Oct 01 2002