digitalmars.D - Do we need Mat, Vec, TMmat, Diag, Sym and other matrix types?
- 9il (34/34) Mar 12 2018 Hi All,
- jmh530 (28/38) Mar 12 2018 I'm not 100% sold on matrix[i] returning a Vec instead of a
- 9il (10/40) Mar 13 2018 TMat is a transposed matrix. Not sure for now if it would be
- jmh530 (4/10) Mar 13 2018 There are some people who like being able to specify a whether a
- Ilya Yaroshenko (5/17) Mar 13 2018 Good point. Would matrix(j, i) syntax solve this issue? One of
- jmh530 (17/35) Mar 13 2018 I'm not sure I understand what your syntax solution does...
- Martin Tschierschke (8/9) Mar 13 2018 I think for mathematics it is more important for easy handling,
- jmh530 (25/34) Mar 13 2018 The underlying storage format is important for performance,
- 9il (2/6) Mar 13 2018 Maybe we should use only column major order. --Ilya
- jmh530 (8/9) Mar 14 2018 In my head I had been thinking that the Mat type you want to
- 9il (3/9) Mar 14 2018 Nope, I mean completely new type, that will hold Contiguous 2D
- Andrei Alexandrescu (15/22) Mar 14 2018 Has row-major fallen into disuse?
- jmh530 (10/12) Mar 14 2018 C has always been row major and is not in disuse (the GSL library
- Sam Potter (37/39) Mar 14 2018 Leaving aside interop between libraries and domains where row
- bachmeier (8/16) Mar 14 2018 I sure hope not. At least not for a long time anyway. It would be
- Sam Potter (30/46) Mar 14 2018 Sure. The key word in my statement was "ideally". :-)
- jmh530 (23/36) Mar 14 2018 libmir [1] originally started as std.experiemental.ndslice (that
- Sam Potter (16/55) Mar 14 2018 Thanks for the information, and great points made.
- jmh530 (9/12) Mar 14 2018 There is one in dlang-community (there might be others, can't
- Manu (7/18) Mar 14 2018 FWIW, I've never wanted row-major in my career; almost all linear
- 9il (88/112) Mar 14 2018 Already done: Slice [1]
- jmh530 (3/20) Mar 15 2018 It looks like it should expand the alias earlier. No problem with
- jmh530 (9/12) Mar 15 2018 Also, this issue also shows up in mir.ndslice.traits. I had to do
- 9il (13/26) Mar 18 2018 This does not help in practice because ndslice has an alias this
- jmh530 (5/22) Mar 15 2018 This is issue 16486 [1], which is very similar to 16465 [1] and
- 9il (3/9) Mar 14 2018 mir-lapack works with Canonical matrixes, but it is assumed that
- jmh530 (41/45) Mar 13 2018 I didn't really address this in my other post.
- 9il (5/24) Mar 13 2018 Interesting. Maybe user can do:
- jmh530 (25/30) Mar 13 2018 Hopefully, I made the issue more clean in my response to Martin.
- J-S Caux (20/56) Mar 12 2018 Well if D had:
- 9il (3/8) Mar 13 2018 auto row = matrix[i]; // or matrix[i, 0 .. $];
- jmh530 (4/13) Mar 13 2018 Some kind of improvement that replaces 0 .. $ with some shorter
- bachmeier (11/25) Mar 13 2018 What I have settled on is Row(x,2), which returns a range that
- jmh530 (4/13) Mar 13 2018 Some kind of improvement that replaces 0 .. $ with some shorter
- jmh530 (2/5) Mar 13 2018 Sorry for double post.
- Martin Tschierschke (20/24) Mar 13 2018 [...]
- Manu (12/41) Mar 13 2018 I'd like to understand why implement a distinct vector type, rather
- 9il (9/22) Mar 13 2018 This is just and API quesiton of how elements of Nx1/1xN matrix
- Paul O'Neil (4/11) Mar 14 2018 Armadillo (arma.sourceforge.net) has separate column and row vector
- Nordlöw (2/4) Mar 20 2018 Sounds great. In which repo will these changes happen?
- 9il (2/6) Sep 10 2018 It is planned to be in mir-algorithm, though
Hi All,

The Dlang multidimensional range type, ndslice, is a struct composed of an iterator, lengths and possibly strides. It does not own memory and does not know anything about its content. ndslice is a faster and extended version of numpy.ndarray.

After some work on commercial projects based on Lubeck [1] and ndslice, I figured out what API and memory management are required to make Dlang super fast and math-friendly at the same time. The concept is the following:

1. All memory is managed by a global BetterC thread-safe ARC allocator. Optionally the allocator can be overloaded.
2. Users can take the internal ndslice to use the mir.ndslice API inside functions.
3. auto matrixB = matrixA; // increases the ARC count
4. auto matrixB = matrixA.dup; // allocates a new matrix
5. matrix[i] returns a Vec and increases the ARC count; matrix[i, j] returns the content of the cell.
6. Clever `=` expression-based syntax. For example:

    // performs a CBLAS call of GEMM [3] and does zero memory allocations
    C = alpha * A * B + beta * C;

`Mat` and other types will support any numeric types, POD-like structs, plus a special overload for `bool` based on `bitwise` [2].

I have a lot of work for the next months, but I am looking for a good opportunity to make Mat happen. For contributing or co-financing: Ilya Yaroshenko at gmail com

Best Regards,
Ilya

[1] https://github.com/kaleidicassociates/lubeck
[2] http://docs.algorithm.dlang.io/latest/mir_ndslice_topology.html#bitwise
[3] http://www.netlib.org/lapack/explore-html/d1/d54/group__double__blas__level3_gaeda3cbd99c8fb834a60a6412878226e1.html#gaeda3cbd99c8fb834a60a6412878226e1
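To make the copy, dup and indexing semantics above concrete, here is a hedged sketch of the reference-counting behavior. `RcMat` and all of its internals are hypothetical illustrations, not the actual Mir/BetterC allocator design:

```d
// Hedged sketch: how `auto b = a;` can bump a count while `.dup`
// copies. The real design would use a global thread-safe allocator.
struct RcMat(T)
{
    private T[] payload;     // matrix data, rows * cols elements
    private size_t* counter; // shared reference count
    size_t rows, cols;

    this(size_t rows, size_t cols)
    {
        this.rows = rows;
        this.cols = cols;
        payload = new T[rows * cols]; // stand-in for the ARC allocator
        counter = new size_t;
        *counter = 1;
    }

    // `auto b = a;` just bumps the count, no data copy
    this(this) { if (counter) ++*counter; }

    ~this()
    {
        if (counter && --*counter == 0)
        {
            // last owner: the real design would return the memory
            // to the global thread-safe allocator here
        }
    }

    // `.dup` allocates a fresh, independent matrix
    RcMat dup() const
    {
        auto r = RcMat(rows, cols);
        r.payload[] = payload[];
        return r;
    }
}
```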
Mar 12 2018
On Tuesday, 13 March 2018 at 03:37:36 UTC, 9il wrote:
> [snip]
> 4. matrix[i] returns a Vec and increase ARC, matrix[i, j] returns a content of the cell.

I'm not 100% sold on matrix[i] returning a Vec instead of a 1-dimensional matrix. R does something similar, and you have to convert things back to a matrix for some computations more often than I'd like. If functions can easily take both Mat and Vec types in a relatively painless fashion, then I wouldn't have an issue with it.

> 5. Clever `=` expression based syntax. For example:
>
>     // performs CBLAS call of GEMM and does zero memory allocations
>     C = alpha * A * B + beta * C;

You might want to explain this in more detail. I saw "expression" and my head went to expression templates, but that doesn't seem to be what you're talking about (overloading opAssign?).

> I have a lot of work for next months, but looking for a good opportunity to make Mat happen.

+1

With respect to the title, the benefit of special matrix types is when we can call functions (LAPACK or otherwise) that are optimized for those types. If you want the best performance for mir, then I think that's what it would take. I'm not sure how much you've thought about this. For instance, I understand from graphics libraries that if you're only working with a particular size of matrix (say 3x3), then you can generate faster code than if you're working with general matrices.

In addition, performance is not the only thing a new user to mir would care about. They would likely also care about ease of use [1] and documentation. Hopefully these continue to improve.

What's TMat? Diag seems like it would be a special case of sparse matrices, though diag is probably simpler to implement.

[1] Would it be seamless to add a Mat to a Diag? Also, what happens to the API when you add 10 different matrix types and need to think about all the interactions?
Mar 12 2018
On Tuesday, 13 March 2018 at 04:35:53 UTC, jmh530 wrote:
> You might want to explain this in more detail. I saw "expression" and my head went to expression templates, but that doesn't seem to be what you're talking about (overloading opAssign?).

Expression templates plus overloading opAssign.

> What's TMat?

TMat is a transposed matrix. Not sure for now if it would be required.

> Diag seems like it would be a special case of sparse matrices, though diag is probably simpler to implement.

Diag should be an independent type. But not in the first release.

> [1] Would it be seamless to add a Mat to a Diag?

Mat C = M + D; // D - diagonal
D += M; // fails at compile time, but
D += M.diag; // passes

> Also, what happens to the API when you add 10 different matrix types and need to think about all the interactions?

The user API would be simple. But it requires clever compile-time and runtime logic to do the best with expression templates.
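A minimal sketch of "expression templates plus overloading opAssign". All names here are hypothetical, not the eventual Mir design, and the GEMM dispatch is only indicated by a comment:

```d
// Lazy node representing a pending matrix product (hypothetical).
struct GemmExpr
{
    const(double)[] a, b; // flattened operands; dimensions elided
}

struct Matrix
{
    double[] data;

    // `A * B` builds a lazy node instead of computing eagerly
    GemmExpr opBinary(string op : "*")(Matrix rhs) const
    {
        return GemmExpr(data, rhs.data);
    }

    // only assignment evaluates the whole expression; a real
    // implementation would issue a single BLAS GEMM call here,
    // writing into `data` with no temporaries
    void opAssign(GemmExpr e)
    {
        // ... BLAS dispatch would go here ...
    }
}

// usage: the full pattern `C = alpha * A * B + beta * C` works the
// same way, with extra node types for scaling and addition
void example()
{
    auto A = Matrix([1.0, 2, 3, 4]);
    auto B = Matrix([5.0, 6, 7, 8]);
    auto C = Matrix(new double[4]);
    C = A * B; // one lazy node, one dispatch, zero temporaries
}
```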
Mar 13 2018
On Tuesday, 13 March 2018 at 10:35:15 UTC, 9il wrote:
> TMat is a transposed matrix. Not sure for now if it would be required.

There are some people who like being able to specify whether a matrix has column or row layout. Would an option to control this be the same thing?
Mar 13 2018
On Tuesday, 13 March 2018 at 12:23:23 UTC, jmh530 wrote:
> There are some people who like being able to specify whether a matrix has column or row layout. Would an option to control this be the same thing?

Good point. Would matrix(j, i) syntax solve this issue? One of the reasons to introduce Mat is API simplicity. ndslice has 3 compile-time params. I hope we would have only one type for Mat, like Mat!double.
Mar 13 2018
On Tuesday, 13 March 2018 at 13:02:45 UTC, Ilya Yaroshenko wrote:
> Good point. Would matrix(j, i) syntax solve this issue? One of the reasons to introduce Mat is API simplicity. ndslice has 3 compile-time params. I hope we would have only one type for Mat, like Mat!double.

I'm not sure I understand what your syntax solution does... But I agree that there is a benefit from API simplicity. It would probably be easier to just say Mat is row-major and have another type that is column-major (or have the options in ndslice). Nevertheless, it can't hurt to look at what other matrix libraries do.

Eigen's (C++ library) Matrix class uses template arguments to set storage order (_Options). It looks like Eigen has six template arguments.
https://eigen.tuxfamily.org/dox/classEigen_1_1Matrix.html

Numpy does the same thing at run-time:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html

Also, many of the languages that emphasize linear algebra strongly (Fortran, Matlab, etc.) use column-major order. Row-major order is most popular coming from C-based languages.
https://en.wikipedia.org/wiki/Row-_and_column-major_order
Mar 13 2018
On Tuesday, 13 March 2018 at 14:13:02 UTC, jmh530 wrote:
> [...]
> https://en.wikipedia.org/wiki/Row-_and_column-major_order

I think that for mathematics, easy handling is more important: being able to get the element a_ij of a matrix as a(i, j), and not only as a[i-1, j-1]. The underlying storage concept is less important and depends just on the used libs, which should be the best (fastest) available for the purpose.
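The 1-based a(i, j) convention can be layered over 0-based storage with a thin wrapper. This is a sketch; `MathMat` is a hypothetical name, not an existing Mir type:

```d
// Sketch: textbook-style 1-based a(i, j) access on top of 0-based
// row-major storage, alongside the usual D indexing.
struct MathMat
{
    double[] data;
    size_t rows, cols;

    // a(i, j) with 1-based mathematical indices
    ref double opCall(size_t i, size_t j)
    {
        assert(1 <= i && i <= rows && 1 <= j && j <= cols);
        return data[(i - 1) * cols + (j - 1)];
    }

    // a[i, j] keeps the usual 0-based D indexing
    ref double opIndex(size_t i, size_t j)
    {
        return data[i * cols + j];
    }
}

unittest
{
    MathMat a = { data: [1.0, 2, 3, 4], rows: 2, cols: 2 };
    assert(a(1, 1) == a[0, 0]); // same element, two conventions
    assert(a(2, 1) == 3.0);
}
```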
Mar 13 2018
On Tuesday, 13 March 2018 at 15:47:36 UTC, Martin Tschierschke wrote:
> The underlying storage concept is less important and depends just on the used libs, which should be the best (fastest) available for the purpose.

The underlying storage format is important for performance, especially for cache lines. For instance, consider calculating the sum of the columns of a matrix stored in row-major format vs. column-major format. If it is stored column-wise, you can just loop right down each column.

In LAPACKE (note the E), the first parameter on just about every function is a variable controlling whether it is in row-major or column-major order. By contrast, I believe the original LAPACK (without E) was written for Fortran and is in column-major storage [1]. The documentation for LAPACKE notes:

"Note that using row-major ordering may require more memory and time than column-major ordering, because the routine must transpose the row-major order to the column-major order required by the underlying LAPACK routine."

It also is relevant if people who use mir want to interact with libraries that use different memory layouts. Alternately, people who use column-major languages like Fortran might want to call mir code.

[1] mir-lapack uses Canonical slices in many of these functions. I assume this is correct, but I have a nagging feeling that I should compare the results of some of these functions with another language to really convince myself... When you increment an iterator on Canonical, it's still going in row order.
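The column-sum example above can be made concrete with a single flat array interpreted both ways. In the column-major case the inner loop walks contiguous memory; in the row-major case it strides by `cols`, touching a new cache line almost every step on large matrices:

```d
// Same logical matrix, two flat layouts, two access patterns.
double[] columnSums(const(double)[] a, size_t rows, size_t cols, bool colMajor)
{
    auto sums = new double[cols];
    sums[] = 0.0;
    foreach (j; 0 .. cols)
        foreach (i; 0 .. rows)
            // column-major: contiguous inner loop; row-major: strided
            sums[j] += colMajor ? a[j * rows + i] : a[i * cols + j];
    return sums;
}

unittest
{
    // 2x2 matrix [[1, 2], [3, 4]] stored both ways
    assert(columnSums([1.0, 2, 3, 4], 2, 2, false) == [4.0, 6.0]);
    assert(columnSums([1.0, 3, 2, 4], 2, 2, true)  == [4.0, 6.0]);
}
```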
Mar 13 2018
On Tuesday, 13 March 2018 at 17:10:03 UTC, jmh530 wrote:
> "Note that using row-major ordering may require more memory and time than column-major ordering, because the routine must transpose the row-major order to the column-major order required by the underlying LAPACK routine."

Maybe we should use only column-major order. --Ilya
Mar 13 2018
On Wednesday, 14 March 2018 at 05:01:38 UTC, 9il wrote:
> Maybe we should use only column major order. --Ilya

In my head I had been thinking that the Mat type you want to introduce would be just an alias to a 2-dimensional Slice with a particular SliceKind and iterator. Am I right on that? If that's the case, why not introduce a Tensor type that corresponds to Slice but in column-major storage, and then have Mat (or Matrix) be an alias to a 2-dimensional Tensor with a particular SliceKind and iterator?
Mar 14 2018
On Wednesday, 14 March 2018 at 12:45:24 UTC, jmh530 wrote:
> In my head I had been thinking that the Mat type you want to introduce would be just an alias to a 2-dimensional Slice with a particular SliceKind and iterator. Am I right on that?

Nope, I mean a completely new type that will hold a Contiguous 2D Slice as its internal payload.
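A hedged sketch of what "a completely new type holding a Contiguous 2D Slice as internal payload" might look like. The struct name, fields, and accessors are assumptions, not the eventual Mir design; the Slice type parameters follow the mir-algorithm API of that era:

```d
import mir.ndslice; // provides Slice, Contiguous

// Hypothetical wrapper, not the actual design.
struct Mat(T)
{
    Slice!(Contiguous, [2], T*) payload; // owned, contiguous storage

    size_t rows() @property { return payload.length!0; }
    size_t cols() @property { return payload.length!1; }

    // escape hatch: functions can borrow the ndslice to use the
    // full mir.ndslice API internally
    auto lightScope() { return payload; }
}
```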
Mar 14 2018
On 03/14/2018 01:01 AM, 9il wrote:
> Maybe we should use only column major order. --Ilya

Has row-major fallen into disuse?

Generally: it would be great to have a standard collection of the typical data formats used in linear algebra and scientific coding. This would allow interoperation without having each library define its own types with identical layout but different names. I'm thinking of:

* multidimensional hyperrectangular
* multidimensional jagged
* multidimensional hypertriangular, if some libraries use it
* sparse vector (whatever formats are most common; I assume an array of integral/floating-point pairs, with the integral either before or after the floating-point number)

No need for a heavy interface on top of these. These structures would be low-maintenance and facilitate a common data language for libraries.

Andrei
Mar 14 2018
On Wednesday, 14 March 2018 at 16:16:55 UTC, Andrei Alexandrescu wrote:
> Has row-major fallen into disuse?

C has always been row-major and is not in disuse (the GSL library has gsl_matrix, and that is row-major). However, Fortran and many linear algebra languages/frameworks have also always been column-major. Some libraries allow both.

I use Stan for Bayesian statistics. It has an array type and a matrix type. Arrays are row-major and matrices are column-major. (Now that I think on it, an array of matrices would probably resolve most of my needs for an N-dimensional column-major type.)
Mar 14 2018
On Wednesday, 14 March 2018 at 16:16:55 UTC, Andrei Alexandrescu wrote:
> Has row-major fallen into disuse?

Leaving aside interop between libraries and domains where row-major is often used (e.g. computer graphics), the issue is semantic, I think. AFAIK, basically all "modern" domains which involve linear algebra ("data science", machine learning, scientific computing, statistics, etc.) at least formulate their models using the standard linear algebra convention, which is to treat vectors as column vectors (column-major). It seems, at least tacitly, that this question is partly about whether to treat this use case as a first-class citizen: column vectors are the norm in these areas.

It's great to have flexibility at the low level to deal with whatever contingency may arise, but it seems that it might be worth taking a page out of MATLAB/numpy/Julia's book and trying to make the common case as easy as possible (what you were alluding to with a "common data language", I think). This seems to match the D notion of having "high torque". Start with svd(X) and ramp up as necessary.

Along these lines, including a zoo of different types of hypermatrices (triangular, banded, sparse, whatever) is cool, but the common cases for plain ol' matrices are dense, sparse, diagonal, triangular, Toeplitz, etc. Ideally, data structures and algorithms covering this would be in the standard library?

Also, I think Armadillo is an extremely nice library, but even it can be a little frustrating and clunky at times. Another case study: the Python scientific toolchain (numpy/scipy/matplotlib) seems to be going in the direction of deprecating "ipython --pylab" (basically, start ipython in a MATLAB compatibility mode so that typing typical MATLAB commands "just works"). This seems to me to be a huge mistake: the high-level "scripting mode" provided by "ipython --pylab" is extremely valuable.

This comment is coming from someone who has been sitting by the edge of the pool, waiting to hop in and check out D as a replacement for a combination of C++/Python/Julia/MATLAB for research in scientific computing. Take all this with a grain of salt, since I haven't contributed anything to the D community. :^)
Mar 14 2018
On Wednesday, 14 March 2018 at 17:22:16 UTC, Sam Potter wrote:
> Ideally, data structures and algorithms covering this would be in the standard library?

I sure hope not. At least not for a long time anyway. It would be hard to make any progress if it were in the standard library. At this stage, functionality is more important than having a tiny amount of code that is written perfectly and has satisfied the 824 rules necessary to get into Phobos.

> This comment is coming from someone who has been sitting by the edge of the pool, waiting to hop in and check out D as a replacement for a combination of C++/Python/Julia/MATLAB for research in scientific computing. Take all this with a grain of salt, since I haven't contributed anything to the D community. :^)

What has been holding you back from using D? I've experienced no difficulties, but maybe I've been lucky.
Mar 14 2018
On Wednesday, 14 March 2018 at 17:36:18 UTC, bachmeier wrote:
> I sure hope not. At least not for a long time anyway. [snip]

Sure. The key word in my statement was "ideally". :-)

For what it's worth, there is already an "informal spec" in the form of the high-level interface for numerical linear algebra and scientific computing that has been developed (over three decades?) in MATLAB. This spec has been replicated (more or less) in Julia, Python, Octave, Armadillo/Eigen, and others. I'm not aware of all the subtleties involved in incorporating it into any standard library, let alone D's, but maybe this is an interesting place where D could get an edge over other competing languages. Considering that people in Python land have picked up D as a "faster Python", there might be more traction here than is readily apparent. Just some thoughts; I'm very biased. The reality is that these algorithms are equally (if not more) important in their domain than the usual "CS undergrad algorithms". It's easy enough to use a library, but again, "high torque" would seem to point to making it easier.

> What has been holding you back from using D? I've experienced no difficulties, but maybe I've been lucky.

Hard to say. Right now I write numeric C libraries using C++ "as a backend". I've used D for some very small projects, and my feeling is that I'm using C++ as a worse D. I lean heavily on C++'s templates and metaprogramming abilities, and it looks like I could do what I do now in C++ more easily in D, and could do many more things besides. I think what's holding me back now is a combination of my own inertia (I have to get things done!) and possibly a degree of perceived "instability" (it's totally unclear to me how valid this perception is). I haven't tried "BetterC" yet, but once I have some downtime, I'm going to give it a whirl and see how easily it hooks up with my target languages. If it's a nice experience, I would very much like to try a larger project.
Mar 14 2018
On Wednesday, 14 March 2018 at 20:21:15 UTC, Sam Potter wrote:
> [snip] Considering that people in Python land have picked up D as a "faster Python", there might be more traction here than is readily apparent. [snip]

libmir [1] originally started as std.experimental.ndslice (that component is now mir-algorithm). It was removed from std.experimental because it wasn't stable enough yet and needed to make breaking changes. I think it's doing just fine as a standalone library, rather than as part of the standard library. As this thread makes clear, there's certainly more work to be done on it, but I'm sure Ilya would appreciate any feedback or assistance.

I'm sympathetic to your point about D getting an edge by having a better linear algebra experience. I came to D for faster Python/R/Matlab (and not C++), though if I need to do something quickly, I still defer to Python/R. However, if you look at the TIOBE index, R and Matlab are at 18/20. Python is quite a bit higher, but its growth in popularity was not largely due to the Numpy/Scipy ecosystem. So while I think that D could get more traction if libmir turns itself into a premiere linear algebra library, we should be realistic that linear algebra is a relatively small segment of how people use programming languages. Maybe these firms might be willing to pay up for more support, though... (if a user could replace pandas with libmir, I would imagine some financial firms might be interested).

[1] https://github.com/libmir
Mar 14 2018
On Wednesday, 14 March 2018 at 22:17:49 UTC, jmh530 wrote:
> [snip] So while I think that D could get more traction if libmir turns itself into a premiere linear algebra library, we should be realistic that linear algebra is a relatively small segment of how people use programming languages. [snip]

Thanks for the information, and great points made.

I think it's worth mentioning Julia's rapid development. I'm not sure exactly how much traction Julia is getting in different sectors, but I've heard that it has a certain appeal by dint of it being open source (no need to buy expensive MATLAB seats). Julia looks like it could kill MATLAB. I think D has no chance of competing here (beyond being a backend for libraries) without a better scientific experience. OTOH, the fact that D doesn't have a REPL may kill it from the get-go (hard to do exploratory data analysis). As you said, you still defer to Python/R/etc; I have a hard time seeing myself using a language like D for everything I work on.

This is getting pretty derailed from the main subject, sorry! I would like to let Ilya and others continue the original discussion now.
Mar 14 2018
On Wednesday, 14 March 2018 at 23:30:55 UTC, Sam Potter wrote:
> [snip] OTOH, the fact that D doesn't have a REPL may kill it from the get-go (hard to do exploratory data analysis).

There is one in dlang-community (there might be others, I can't recall), but it does not yet support Windows and there isn't much documentation. I agree it would be nice to have a D Jupyter kernel that was easy to use. Also, I've been playing with run.dlang.io lately, and while it's not the same thing as a REPL, it's one of the easiest ways to play around with D (and a select few libraries, including mir-algorithm).
Mar 14 2018
On 14 March 2018 at 09:16, Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> Has row-major fallen into disuse?

FWIW, I've never wanted row-major in my career; almost all my linear algebra is for geometric use. Columns correspond to spatial axes, and that's almost always what you want to manipulate... so column-major it is. Or with colour transformation, columns correspond to RGB primaries; same story.
Mar 14 2018
On Wednesday, 14 March 2018 at 16:16:55 UTC, Andrei Alexandrescu wrote:
> Generally: it would be great to have a standard collection of the typical data formats used in linear algebra and scientific coding. This would allow interoperation without having each library define its own types with identical layout but different names. I'm thinking of:
>
> * multidimensional hyperrectangular

Already done: Slice [1]

> * multidimensional jagged

Done, as an N-1 Slice composed of jagged rows:

a) Naive: Slice!(Contiguous, [1], Slice!(Contiguous, [1], double*)*)

b) Intermediate: Slice!(kind, packs, SubSliceIterator!(Iterator, Slicable)) by [2]

c) Proper, with compressed indexes by [3]: Slice!(Contiguous, [1], SubSliceIteratorInst!I), where SubSliceIteratorInst = SubSliceIterator!(SlideIterator!(size_t*, 2, staticArray), I*);

The c) variant is already used by mir.graph to represent graph indexes.

> * multidimensional hypertriangular if some libraries use it

I have seen only 2D packed triangular matrixes in LAPACK. ndslice provides them using the `stairs` topology [4]. The type definitions are

Slice!(Contiguous, [1], StairsIterator!(T*))
and
Slice!(Contiguous, [1], RetroIterator!(MapIterator!(StairsIterator!(RetroIterator!(T*)), retro)))

The last one can be simplified, though. These types are used in mir-lapack [5], which is a LAPACK wrapper with an ndslice API created for Lubeck.

> * sparse vector (whatever formats are most common ...)

Done. The multidimensional Dictionary of Keys (DOK) format is provided by Sparse (an alias for Slice) by [6]. The multidimensional Compressed Sparse Rows (CSR) format is provided by CompressedTensor [7]. There are already sparse BLAS routines for CompressedTensor [8].

> No need for a heavy interface on top of these. These structures would be low-maintenance and facilitate a common data language for libraries.

As you can see, ndslice already provides a common data language. The next goal is to provide a high-level common data language with automated memory management and a Matlab-like API.

BTW, could you please help with the following issue?!

struct S(int b, T) { }
alias V(T) = S!(1, T);
auto foo (T)(V!T v) { }
void main() {
    V!double v;
    foo(v);
}

Error: template onlineapp.foo cannot deduce function from argument types !()(S!(1, double)), candidates are:
onlineapp.d(7): onlineapp.foo(T)(V!T v)

I mean, I need this kind of code to compile. Currently I use bad workarounds or define huge types like

Slice!(Contiguous, [1], StairsIterator!(T*))
Slice!(Contiguous, [1], RetroIterator!(MapIterator!(StairsIterator!(RetroIterator!(T*)), retro)))

instead of StairsDown!T and StairsUp!T.

Of course, it is not a problem to add a workaround for a standalone project. But for an open source library it is a huge pain (for example, see the mir-lapack source with the two types above). ndslice is very powerful when you need to construct a new type. But there is a problem: a type definition is very often too complex. So an engineer has to choose either to use a complex generic API or to provide a simple non-generic one. A simple generic API is required for Dlang's success in math. Please let me know your thoughts on whether the issue can be fixed.

Best regards,
Ilya

[1]
[2]
[3] http://docs.algorithm.dlang.io/latest/mir_ndslice_topology.html#pairwiseMapSubSlices
[4]
[5] https://github.com/libmir/mir-lapack
[6]
[7]
[8] http://docs.mir.dlang.io/latest/mir_sparse_blas_gemm.html
Mar 14 2018
On Thursday, 15 March 2018 at 05:04:42 UTC, 9il wrote:[snip] BTW, could you please help with the following issue?! struct S(int b, T) { } alias V(T) = S!(1, T); auto foo (T)(V!T v) { } void main() { V!double v; foo(v); } Error: template onlineapp.foo cannot deduce function from argument types !()(S!(1, double)), candidates are: onlineapp.d(7): onlineapp.foo(T)(V!T v)It looks like it should expand the alias earlier. No problem with auto foo (T)(S!(1, T) v) {};
Mar 15 2018
On Thursday, 15 March 2018 at 12:49:22 UTC, jmh530 wrote:[snip] It looks like it should expand the alias earlier. No problem with auto foo (T)(S!(1, T) v) {};This issue also shows up in mir.ndslice.traits. I had to do the equivalent of isV below; the alternate version does not work. However, given that you have the traits, you can use them in a template constraint. So you only have to repeat yourself once in the trait, rather than many times in each function that uses it.

enum bool isV(T) = is(T : S!(1, U), U);          // works
enum bool isV_alternate(T) = is(T : V!(U), U);   // fails to match through the alias
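For reference, this workaround pattern compiles end to end in plain D. A self-contained sketch using the same S/V/isV names from above (the constraint-based foo is my addition, not mir code):

```d
// The alias V hides S from template argument deduction, but an is()
// pattern over the real template S still matches, so the trait only
// has to be written once and reused as a constraint.
struct S(int b, T) { }
alias V(T) = S!(1, T);

enum bool isV(T) = is(T : S!(1, U), U);   // works
// enum bool isVAlt(T) = is(T : V!U, U);  // does not match through the alias

// The trait replaces repeating S!(1, T) in every signature.
auto foo(T)(T v) if (isV!T) { return true; }

void main()
{
    V!double v;
    static assert( isV!(V!double));
    static assert(!isV!int);
    assert(foo(v));
}
```

This sidesteps the deduction limitation entirely: the parameter is a plain T, and the constraint does the pattern matching that IFTI cannot do through the alias.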
Mar 15 2018
On Thursday, 15 March 2018 at 14:13:25 UTC, jmh530 wrote:On Thursday, 15 March 2018 at 12:49:22 UTC, jmh530 wrote:This does not help in practice because ndslice has an alias this primitive to be implicitly convertible to the const version. For example, for arrays one can write:

auto foo(T)(const(T)[] ar) {}

and this would work with double[] and immutable(double)[] as well. The same is true for ndslice:

auto foo(T)(Slice!(Contiguous, [1], const(T)*)[] ar) {}

It is very important not to create additional instantiations for numeric code, because numeric code is usually quite heavy. I don't know the DMD internals. How can the aliasing issue be solved? Best regards, Ilya[snip] It looks like it should expand the alias earlier. No problem with auto foo (T)(S!(1, T) v) {};Also, this issue also shows up in mir.ndslice.traits. I had to do the equivalent of isV below. It doesn't work to do the alternate version. However, given that you have the traits, then you can use them in a template constraint. So you have to repeat yourself in the trait once, rather than bunches of times in each function that calls them. enum bool isV(T) = is(T : S!(1, U), U); enum bool isV_alternate(T) = is(T : V!(U), U);
Mar 18 2018
On Thursday, 15 March 2018 at 05:04:42 UTC, 9il wrote:[snip] BTW, could you please help with the following issue?! struct S(int b, T) { } alias V(T) = S!(1, T); auto foo (T)(V!T v) { } void main() { V!double v; foo(v); } Error: template onlineapp.foo cannot deduce function from argument types !()(S!(1, double)), candidates are: onlineapp.d(7): onlineapp.foo(T)(V!T v)This is issue 16486 [1], which is very similar to 16465 [2], and it seems like one of them should be closed as a duplicate. [1] https://issues.dlang.org/show_bug.cgi?id=16486 [2] https://issues.dlang.org/show_bug.cgi?id=16465
Mar 15 2018
On Tuesday, 13 March 2018 at 17:10:03 UTC, jmh530 wrote:[1] mir-lapack uses Canonical slices in many of these functions. I assume this is correct, but I have a nagging feeling that I should compare the results of some of these functions with another language to really convince myself...When you increment an iterator on canonical it's still going in row order.mir-lapack works with Canonical matrices, but it is assumed that matrix columns are stored as Slice rows. -- Ilya
Mar 14 2018
On Tuesday, 13 March 2018 at 15:47:36 UTC, Martin Tschierschke wrote:I think for mathematics it is more important for easy handling, to be able to get the element of a matrix a_ij by a(i,j) and not only by a[i-1,j-1]. [snip]I didn't really address this in my other post. What you're talking about is 0-based or 1-based indexing. Most languages force you to choose, though there are a few languages that let you specify the type of index you want (Pascal, Chapel, and Ada come to mind). While I've thought that the way Chapel does domains is cool, I guess I never gave much thought to implementing optional 0- or 1-based indexing in D. Now that I'm thinking about it, I don't see why it couldn't be implemented. For instance, there's nothing stopping you from writing a function like below that has the same behavior.

auto access(T)(T x, size_t i, size_t j)
{
    return x.opIndex(i - 1, j - 1);
}

What you really care about is the nice syntax. In that case, you could write an opIndex function that has different behavior based on a template parameter in Slice. Even something simple like below might work.

auto ref opIndex(Indexes...)(Indexes indexes) @safe
    if (isIndexSlice!Indexes)
{
    static if (defaultIndexingBehavior)
    {
        return this.opIndex!(indexes.length)([indexes]);
    }
    else
    {
        Indexes newIndexes;
        foreach (i, e; indexes)
        {
            newIndexes[i] = e - 1;
        }
        return this.opIndex!(indexes.length)([newIndexes]);
    }
}

The time-consuming part is that you'd have to go through all of mir where it relies on opIndex and ensure that both sets of behavior work.
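A minimal self-contained version of that template-parameter idea can be written without mir at all. The Matrix type and oneBased flag below are hypothetical illustrations of the approach, not mir API:

```d
// Hypothetical sketch: a flat row-major matrix whose opIndex can be
// switched between 0-based and 1-based indexing via a template flag.
struct Matrix(T, bool oneBased = false)
{
    T[] data;
    size_t rows, cols;

    ref T opIndex(size_t i, size_t j)
    {
        static if (oneBased) { --i; --j; } // shift 1-based indexes down
        return data[i * cols + j];
    }
}

void main()
{
    auto a = Matrix!double([1.0, 2, 3, 4], 2, 2);
    auto b = Matrix!(double, true)([1.0, 2, 3, 4], 2, 2);
    assert(a[0, 1] == 2); // 0-based indexing
    assert(b[1, 2] == 2); // same element, 1-based indexing
}
```

Since the flag is a compile-time parameter, the shift is resolved by `static if` and a 0-based instantiation pays no runtime cost for the feature.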
Mar 13 2018
On Tuesday, 13 March 2018 at 14:13:02 UTC, jmh530 wrote:On Tuesday, 13 March 2018 at 13:02:45 UTC, Ilya Yaroshenko wrote:matrix(j, i) == matrix[i, j] (reversed order)On Tuesday, 13 March 2018 at 12:23:23 UTC, jmh530 wrote:I'm not sure I understand what your syntax solution does...[...]Good point. Would matrix(j, i) syntax solve this issue? One of the reasons to introduce Mat is API simplicity. ndslice has three compile-time params. I hope we would have only one type for Mat, like Mat!double.But I agree that there is a benefit from API simplicity. It would probably be easier to just say Mat is row-major and have another type that is column-major (or have the options in ndslice). Nevertheless, it can't hurt to look at what other matrix libraries do. Eigen's (C++ library) Matrix class uses template arguments to set storage order (_Options). It looks like Eigen has six template arguments. https://eigen.tuxfamily.org/dox/classEigen_1_1Matrix.htmlInteresting. Maybe the user can do: alias Mat = Matrix!(double, ColMajor); Fixed lengths can be considered too. Eigen has good ideas.
Mar 13 2018
On Tuesday, 13 March 2018 at 16:40:13 UTC, 9il wrote:On Tuesday, 13 March 2018 at 14:13:02 UTC, jmh530 wrote:[snip] I'm not sure I understand what your syntax solution does...matrix(j, i) == matrix[i, j] (reversed order)Hopefully, I made the issue clearer in my response to Martin. The issue with row/column-major order is one part performance and one part interoperability with other matrix libraries. Perhaps most importantly, imagine how confusing a new user would find that syntax... A free-standing function, such as the simple one below, might be less confusing. Also, this is a solution for the Matrix type, but not so much the Slice type.

auto reverseIndex(T)(T x, size_t i, size_t j)
{
    return x[j, i];
}

The reverseIndex function is convenient, but suppose you are looping through each element of the second column. Is there any performance advantage to using the reverseIndex function to do so? I suspect not, because you still have to jump around in memory: the underlying storage is row-major. You may not notice this effect when the CPU is able to pre-fetch the whole matrix and put it in cache, but as the matrix gets larger, you can't fit it all in cache and it starts to matter more. Also, you might break vectorized operations. Personally, while I think it's important to think about, I also don't think it's a hugely pressing issue so long as the API is flexible enough that you can add the functionality in the future.
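The memory-jumping point can be illustrated without mir: in row-major storage the offset of element (i, j) is i*cols + j, so stepping along a row moves one element at a time while stepping down a column jumps by cols elements. A small sketch (the helper name is made up):

```d
// In row-major storage, element (i, j) lives at offset i*cols + j:
// walking a row is contiguous (stride 1), walking a column jumps by
// `cols` elements per step - the cache-locality cost discussed above.
size_t rowMajorOffset(size_t i, size_t j, size_t cols)
{
    return i * cols + j;
}

void main()
{
    enum cols = 4;
    // Neighbors within row 1 are adjacent in memory.
    assert(rowMajorOffset(1, 1, cols) - rowMajorOffset(1, 0, cols) == 1);
    // Neighbors within column 1 are `cols` elements apart.
    assert(rowMajorOffset(1, 1, cols) - rowMajorOffset(0, 1, cols) == cols);
}
```

Swapping the index order in the caller (as reverseIndex does) changes the notation but not these strides, which is why it cannot recover the performance of a genuinely column-major layout.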
Mar 13 2018
On Tuesday, 13 March 2018 at 03:37:36 UTC, 9il wrote:

Hi All, The Dlang multidimensional range type, ndslice, is a struct composed of an iterator, lengths, and possibly strides. It does not own memory and does not know anything about its content. ndslice is a faster and extended version of numpy.ndarray. After some work on commercial projects based on Lubeck [1] and ndslice, I figured out what API and memory management are required to make Dlang super fast and math friendly at the same time. The concept is the following:

1. All memory is managed by a global BetterC thread-safe ARC allocator. Optionally, the allocator can be overloaded.
2. User can take an internal ndslice to use the mir.ndslice API internally in functions.
2. auto matrixB = matrixA; // increases ARC
3. auto matrixB = matrixA.dup; // allocates a new matrix
4. matrix[i] returns a Vec and increases ARC; matrix[i, j] returns the content of the cell.
5. Clever `=` expression based syntax. For example:

// performs a CBLAS call of GEMM and does zero memory allocations
C = alpha * A * B + beta * C;

`Mat` and other types will support any numeric types, POD-like structs, plus a special overload for `bool` based on `bitwise` [2]. I have a lot of work for the next months, but am looking for a good opportunity to make Mat happen. For contributing or co-financing: Ilya Yaroshenko at gmail com Best Regards, Ilya [1] https://github.com/kaleidicassociates/lubeck [2] http://docs.algorithm.dlang.io/latest/mir_ndslice_topology.html#bitwise [3] http://www.netlib.org/lapack/explore-html/d1/d54/group__double__blas__level3_gaeda3cbd99c8fb834a60a6412878226e1.html#gaeda3cbd99c8fb834a60a6412878226e1

Well, if D had:
- a good matrix type, supporting all numeric types (absolutely crucial: including complex!)
- *very important*: with operations syntax corresponding to the well-established standard mathematical conventions (including in labelling the elements!) 
- with superfast LU decomposition, det and similar (perhaps via LAPACK)
- able to recognize/carry a tag for (anti)symmetric, hermitian, ... and exploit these in computations

then I'd be more seriously considering switching to D. Your suggestion [4] that matrix[i] returns a Vec is perhaps too inflexible. What one needs sometimes is to return a row or a column of a matrix, so a notation like matrix[i, ..] or matrix[.., j], returning respectively a row or column, would be useful. I'd be happy to help out/give advice from the viewpoint of a scientist trying to "graduate" away from C++ without having to sacrifice performance.
Mar 12 2018
On Tuesday, 13 March 2018 at 05:36:06 UTC, J-S Caux wrote:Your suggestion [4] that matrix[i] returns a Vec is perhaps too inflexible. What one needs sometimes is to return a row, or a column of a matrix, so a notation like matrix[i, ..] or matrix[.., j] returning respectively a row or column would be useful.auto row = matrix[i]; // or matrix[i, 0 .. $]; auto col = matrix[0 .. $, j];
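For readers wondering how such slicing syntax is even possible, D's multidimensional opIndex/opSlice/opDollar hooks support it directly. A hypothetical mir-free sketch (row slice only; a column slice would need a strided view, omitted for brevity):

```d
// Hypothetical sketch of how m[i, 0 .. $] style row selection can be
// expressed with D's multidimensional indexing operators.
struct Mat
{
    double[] data;
    size_t rows, cols;

    ref double opIndex(size_t i, size_t j) { return data[i * cols + j]; }

    // a .. b inside the brackets becomes opSlice!dim(a, b); $ becomes opDollar!dim
    size_t[2] opSlice(size_t dim)(size_t a, size_t b) { return [a, b]; }
    size_t opDollar(size_t dim)() { return dim == 0 ? rows : cols; }

    // m[i, a .. b] -> slice of row i (contiguous in row-major storage)
    double[] opIndex(size_t i, size_t[2] js)
    {
        return data[i * cols + js[0] .. i * cols + js[1]];
    }
}

void main()
{
    auto m = Mat([1.0, 2, 3, 4, 5, 6], 2, 3);
    assert(m[1, 0 .. $] == [4.0, 5, 6]); // second row
    assert(m[0, 1] == 2);                // single element
}
```

The `0 .. $` spelling is therefore ordinary operator overloading; the shorter-syntax discussion linked above is about finding a nicer token for the common whole-axis case.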
Mar 13 2018
On Tuesday, 13 March 2018 at 10:39:29 UTC, 9il wrote:On Tuesday, 13 March 2018 at 05:36:06 UTC, J-S Caux wrote:Some kind of improvement that replaces 0 .. $ with some shorter syntax has been brought up in the past. https://github.com/libmir/mir-algorithm/issues/53Your suggestion [4] that matrix[i] returns a Vec is perhaps too inflexible. What one needs sometimes is to return a row, or a column of a matrix, so a notation like matrix[i, ..] or matrix[.., j] returning respectively a row or column would be useful.auto row = matrix[i]; // or matrix[i, 0 .. $]; auto col = matrix[0 .. $, j];
Mar 13 2018
On Tuesday, 13 March 2018 at 12:09:04 UTC, jmh530 wrote:On Tuesday, 13 March 2018 at 10:39:29 UTC, 9il wrote:What I have settled on is Row(x,2), which returns a range that works with foreach. I tried x[_,2] to return Row(x,2) but didn't like reading it, so I went with x[_all,2] instead. Similarly for Col(x,2) and x[2,_all]. The exact form is bikeshedding and shouldn't make much difference. I use ByRow(x) and ByColumn(x) to iterate over the full matrix. IME, if you try to mix row-order and column-order, or 0-based indexing and 1-based indexing, it's too complicated to write correct code that interacts with other libraries. I think you need to choose one and go with it.On Tuesday, 13 March 2018 at 05:36:06 UTC, J-S Caux wrote:Some kind of improvement that replaces 0 .. $ with some shorter syntax has been brought up in the past. https://github.com/libmir/mir-algorithm/issues/53Your suggestion [4] that matrix[i] returns a Vec is perhaps too inflexible. What one needs sometimes is to return a row, or a column of a matrix, so a notation like matrix[i, ..] or matrix[.., j] returning respectively a row or column would be useful.auto row = matrix[i]; // or matrix[i, 0 .. $]; auto col = matrix[0 .. $, j];
Mar 13 2018
On Tuesday, 13 March 2018 at 21:04:28 UTC, bachmeier wrote:What I have settled on is Row(x,2), which returns a range that works with foreach. I tried x[_,2] to return Row(x,2) but didn't like reading it, so I went with x[_all,2] instead. Similarly for Col(x,2) and x[2,_all]. The exact form is bikeshedding and shouldn't make much difference. I use ByRow(x) and ByColumn(x) to iterate over the full matrix.This Row(x, 2) is essentially the same approach as Armadillo (it also has rows, cols, span). mir's select isn't quite the same thing. _all is interesting. mir's byDim can iterate by both rows and columns.IME, if you try to mix row-order and column-order, or 0-based indexing and 1-based indexing, it's too complicated to write correct code that interacts with other libraries. I think you need to choose one and go with it.Fair enough. mir uses row-order, 0-based indexing by default. That's fine, I'm used to it at this point. What I was thinking about was that Slice's definition would change from

struct Slice(SliceKind kind, size_t[] packs, Iterator)

to

struct Slice(SliceKind kind, size_t[] packs, Iterator, MemoryLayout layout = rowLayout)

so that the user has control over changing it on an object-by-object basis. Ideally, they would keep it the same across the entire program. Nevertheless, I would still prefer that all functions in mir provide the same result regardless of what layout is chosen (not sure you can say that about switching to 0-based indexing...). The idea would be that whatever is built on top of it shouldn't need to care about the layout. However, due to cache locality, some programs might run faster depending on the layout chosen. With respect to interacting with libraries, I agree that a user should choose either row-order or column-order and stick to it. But what options are available for the user of a column-major language (or array library) to call mir if mir only makes available functions that handle row-major layouts? 
RcppArmadillo doesn't have an issue because both R and Armadillo are column-major. Going the other way, you'd probably know better than I would, but it looks like in embedr the only way I see to assign a D matrix to an RMatrix is by copying values. If a matrix were already in column-major form, how much easier would it be to interact with R?
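A toy version of that MemoryLayout parameter shows how m[i, j] can mean the same element under both layouts while only the storage offset changes. These are hypothetical types sketching the idea, not the proposed Slice change itself:

```d
// Hypothetical sketch of the layout-parameter idea: m[i, j] selects
// the same logical element under both layouts; only the mapping from
// (i, j) to the flat storage offset differs.
enum MemoryLayout { rowMajor, colMajor }

struct Mat(T, MemoryLayout layout = MemoryLayout.rowMajor)
{
    T[] data;
    size_t rows, cols;

    ref T opIndex(size_t i, size_t j)
    {
        static if (layout == MemoryLayout.rowMajor)
            return data[i * cols + j]; // rows are contiguous
        else
            return data[j * rows + i]; // columns are contiguous
    }
}

void main()
{
    // The same logical matrix [[1, 2], [3, 4]] in both layouts.
    auto r = Mat!double([1.0, 2, 3, 4], 2, 2);                          // row-major storage
    auto c = Mat!(double, MemoryLayout.colMajor)([1.0, 3, 2, 4], 2, 2); // column-major storage
    assert(r[0, 1] == 2 && c[0, 1] == 2); // identical indexing semantics
}
```

Code written on top of opIndex is layout-agnostic, which is the interoperability property argued for above: a column-major caller such as R could hand over its buffer unchanged.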
Mar 13 2018
On Tuesday, 13 March 2018 at 22:08:10 UTC, jmh530 wrote:With respect to interacting with libraries, I agree that a user should choose either row-order or column-order and stick to it. But what options are available for the user of a column-major language (or array library) to call mir if mir only makes available functions that handle row-major layouts? RCppArmadillo doesn't have an issue because both R and Armadillo are column-major. Going the other way, you'd probably know better than I would, but it looks like in embedr the only way I see to assign a D matrix to a RMatrix is by copying values. If a matrix was already in column-major form, then how easy much easier would it be to interact with R?With embedr, any data that is used by R has to be allocated by R. Therefore one would have to allocate R data (column-major) and then pass it to Mir. So I suppose you're right that it would be necessary to have some way to work with column-major data if you want to call Mir from R.
Mar 13 2018
On Tuesday, 13 March 2018 at 10:39:29 UTC, 9il wrote:On Tuesday, 13 March 2018 at 05:36:06 UTC, J-S Caux wrote:Some kind of improvement that replaces 0 .. $ with some shorter syntax has been brought up in the past. https://github.com/libmir/mir-algorithm/issues/53Your suggestion [4] that matrix[i] returns a Vec is perhaps too inflexible. What one needs sometimes is to return a row, or a column of a matrix, so a notation like matrix[i, ..] or matrix[.., j] returning respectively a row or column would be useful.auto row = matrix[i]; // or matrix[i, 0 .. $]; auto col = matrix[0 .. $, j];
Mar 13 2018
On Tuesday, 13 March 2018 at 12:16:27 UTC, jmh530 wrote:Some kind of improvement that replaces 0 .. $ with some shorter syntax has been brought up in the past. https://github.com/libmir/mir-algorithm/issues/53Sorry for double post.
Mar 13 2018
On Tuesday, 13 March 2018 at 03:37:36 UTC, 9il wrote: [...]5. Clever `=` expression based syntax. For example: // performs CBLAS call of GEMM and does zero memory allocations C = alpha * A * B + beta * C;[...] My answer is: yes. If D with Lubeck had such a convenient way to write matrix algebra, using well-designed operator overloading, it might attract more users. Unfortunately, my time using this kind of math daily is long past, but I would like to help as a tester. I will look into my master's thesis and see if I can use Lubeck for the calculations done there. I was doing eigenvalue problems with FEM, and in another project partial differential equations with a matrix-based approximation scheme... so this may bring my "gray cells" back to work :-) I was using a C++ lib with operator overloading, resulting in quite convenient expressions. The fact that Java has no operator overloading was the trigger for me to dislike that language. :-)
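For concreteness, the fused update that the `=` syntax would lower to can be written as a naive reference gemm in a few lines of plain D. This is a stand-in for the actual CBLAS call (row-major, square matrices for brevity):

```d
// Naive reference gemm: C = alpha*A*B + beta*C, updating C in place
// with zero allocations - the single operation the clever `=` syntax
// above would dispatch to.
void gemm(double alpha, const double[] a, const double[] b,
          double beta, double[] c, size_t n)
{
    foreach (i; 0 .. n)
        foreach (j; 0 .. n)
        {
            double acc = 0;
            foreach (k; 0 .. n)
                acc += a[i * n + k] * b[k * n + j];
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
}

void main()
{
    // A = I, B = C = [[1, 2], [3, 4]]; C = 2*A*B + 1*C == 3*B
    double[] a = [1, 0, 0, 1];
    double[] b = [1, 2, 3, 4];
    double[] c = [1, 2, 3, 4];
    gemm(2.0, a, b, 1.0, c, 2);
    assert(c == [3.0, 6, 9, 12]);
}
```

The point of the proposed syntax is that the compiler-visible expression `alpha * A * B + beta * C` can be pattern-matched into exactly one such call, instead of materializing temporaries for each intermediate product and sum.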
Mar 13 2018
On 12 March 2018 at 20:37, 9il via Digitalmars-d <digitalmars-d puremagic.com> wrote:Hi All, The Dlang multidimensional range type, ndslice, is a struct composed of an iterator, lengths, and possibly strides. It does not own memory and does not know anything about its content. ndslice is a faster and extended version of numpy.ndarray. After some work on commercial projects based on Lubeck[1] and ndslice I figured out what API and memory management are required to make Dlang super fast and math friendly at the same time. The concept is the following: 1. All memory is managed by a global BetterC thread-safe ARC allocator. Optionally the allocator can be overloaded. 2. User can take an internal ndslice to use the mir.ndslice API internally in functions. 2. auto matrixB = matrixA; // increases ARC 3. auto matrixB = matrixA.dup; // allocates a new matrix 4. matrix[i] returns a Vec and increases ARC; matrix[i, j] returns the content of the cell. 5. Clever `=` expression based syntax. For example: // performs a CBLAS call of GEMM and does zero memory allocations C = alpha * A * B + beta * C; `Mat` and other types will support any numeric types, POD-like structs, plus a special overload for `bool` based on `bitwise` [2]. I have a lot of work for the next months, but am looking for a good opportunity to make Mat happen. For contributing or co-financing: Ilya Yaroshenko at gmail com Best Regards, IlyaI'd like to understand why implement a distinct vector type, rather than just an Nx1/1xN matrix? That kind of sounds like a hassle to me... although there is already precedent for it, in that a scalar is distinct from a 1x1 matrix (or a 1-length vector). I want to talk to you about how we interact with colours better... since in that world, a matrix can't just be a grid of independent storage cells. That will throw some spanners into the works, and I'd like to think that designs will support a classy way of expressing images as matrices of pixel data.
Mar 13 2018
On Wednesday, 14 March 2018 at 05:40:42 UTC, Manu wrote:I'd like to understand why implement a distinct vector type, rather than just a Nx1/1xN matrix?This is just an API question of how elements of an Nx1/1xN matrix should be accessed, e.g. whether one should specify one or two arguments to access an element.That kinds sounds like a hassle to me... although there is already precedent for it, in that a scalar is distinct from a 1x1 matrix (or a 1-length vector).Yes, I have the same thoughts. A very high level of API abstraction may not look good to end users.I want to talk to you about how we interact with colours better... since in that world, a matrix can't just be a grid of independent storage cells. That will throw some spanners into the works, and I'd like to think that designs will support a classy way of expressing images as matrices of pixel data.I have just replied to your letter. Thanks, Ilya
Mar 13 2018
On 03/14/2018 02:33 AM, 9il wrote:On Wednesday, 14 March 2018 at 05:40:42 UTC, Manu wrote:Armadillo (arma.sourceforge.net) has separate column and row vector types that allow using a single index. They can both convert to the matrix type.I'd like to understand why implement a distinct vector type, rather than just a Nx1/1xN matrix?This is just and API quesiton of how elements of Nx1/1xN matrix should be accessed. E.g. should one specify one or two arguments to access an element
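In D, the Armadillo approach maps naturally onto alias this: a hypothetical sketch of a column-vector type with single-index access that any Mat-taking API accepts (the names here are made up for illustration):

```d
// Hypothetical sketch of the Armadillo design: a column vector is
// stored as an n x 1 matrix, adds single-index access, and converts
// to the matrix type implicitly via alias this.
struct Mat
{
    double[] data;
    size_t rows, cols;
    ref double opIndex(size_t i, size_t j) { return data[i * cols + j]; }
}

struct ColVec
{
    Mat mat;        // n x 1 storage
    alias mat this; // any function taking Mat also accepts a ColVec

    this(double[] data) { mat = Mat(data, data.length, 1); }
    ref double opIndex(size_t i) { return mat[i, 0]; } // vector-style access
}

// A Mat-taking function; a ColVec can be passed to it unchanged.
size_t elements(Mat m) { return m.rows * m.cols; }

void main()
{
    auto v = ColVec([3.0, 4, 5]);
    assert(v[1] == 4);        // single index, like Armadillo's vec
    assert(elements(v) == 3); // implicit conversion to the matrix type
}
```

This answers the one-index-or-two question without a separate code path: the vector type answers with one index, and the matrix API still sees an Nx1 matrix.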
Mar 14 2018
On Tuesday, 13 March 2018 at 03:37:36 UTC, 9il wrote:I have a lot of work for next months, but looking for a good opportunity to make Mat happen.Sounds great. In which repo will these changes happen?
Mar 20 2018
On Tuesday, 20 March 2018 at 12:10:08 UTC, Nordlöw wrote:On Tuesday, 13 March 2018 at 03:37:36 UTC, 9il wrote:It is planned to be in mir-algorithm, thoughI have a lot of work for next months, but looking for a good opportunity to make Mat happen.Sounds great. In which repo will these changes happen?
Sep 10 2018