digitalmars.D.announce - DasBetterR
- bachmeier (43/43) Jun 29 2023 I've been using D and R together for a decade. I wrote [a blog
- zjh (2/5) Jun 29 2023 Nice.
- Steven Schveighoffer (5/42) Jun 29 2023 This is very cool! I've never used R, but I have wanted to learn more
- Guillaume Piolat (2/6) Jun 30 2023 Super cool, congrats!
- jmh530 (6/7) Jun 30 2023 Glad you're continuing to do work on this front. There's a lot of
- bachmeier (29/36) Jun 30 2023 I assume you mean that you've allocated memory on the D side,
- jmh530 (5/20) Jun 30 2023 Unfortunate, but understood. Looking at the implementation for
- bachmeier (21/28) Jul 07 2023 I was wrong. They added custom allocators a while back, but
- jmh530 (10/33) Jul 07 2023 Cool.
I've been using D and R together for a decade. I wrote [a blog post for the D Blog](https://dlang.org/blog/2020/01/27/d-for-data-science-calling-r-from-d/) on the eve of the pandemic. I released the [embedrv2 library](https://github.com/bachmeil/embedrv2) in late 2021. It's useful for writing D functions that are called from R, using D's metaprogramming to write the necessary bindings for you.

My programs usually take the opposite approach, where D is the primary language, and I call into R to fill in missing functionality. I've accumulated a large collection of code snippets to enable all kinds of things. The problem is that they were scattered across many projects, there was no consistency across programs, documentation didn't exist, and they were more or less useless to anyone other than me.

[This GitHub repo](https://github.com/bachmeil/betterr) includes D modules, tests demonstrating most of the functionality, documentation, and some posts about how I do specific things. I'm sharing publicly all the things I've been doing in case it has value to anyone else.

Examples of functionality:

- Creating, accessing, and mutating R data structures, including vector, matrix, data frame, list, array, and time series types. Reference counting handles memory management.
- Basic statistical functionality like calculating the mean. Many of these functions use Mir for efficiency.
- Linear algebra
- Random number generation and sampling
- Parallel random number generation
- Numerical optimization: direct access to the C libraries used by R's optim function
- Quadratic programming
- Passing D functions to R without creating a shared library. For example, you can use a D function as the objective function you pass to constrOptim for constrained optimization problems.
[Project website](https://bachmeil.github.io/betterr/)

There's more detail on the website, but I used the name "Better R" because the entirety of R is available inside your D program and you can use D to improve on it as much as you'd like. Feel free to hate the name.

I was originally going to include all of this as part of embedrv2, but realized there was almost no overlap between the two use cases. Moreover, it would be strange to call R from D and call D functions from R in the same program. It simplifies things to keep them in different projects.

If you try it and have problems, you can [create a discussion](https://github.com/bachmeil/betterr/discussions). You can also post in this forum, but I won't guarantee I'll see it.
Jun 29 2023
On Thursday, 29 June 2023 at 23:51:44 UTC, bachmeier wrote:
> I've been using D and R together for a decade. I wrote [a blog post for the D Blog](https://dlang.org/blog/2020/01/27/d-for-data-science-calling-r-from-d/) on the eve of the pandemic. I released the [embedrv2 library](https://github.com/bachmeil/embedrv2) in late 2021. It's useful for writing D functions that are called from R, using D's metaprogramming to write the necessary bindings for you.

Nice.
Jun 29 2023
On 6/29/23 7:51 PM, bachmeier wrote:
> I've been using D and R together for a decade. I wrote [a blog post for the D Blog](https://dlang.org/blog/2020/01/27/d-for-data-science-calling-r-from-d/) on the eve of the pandemic. I released the [embedrv2 library](https://github.com/bachmeil/embedrv2) in late 2021. It's useful for writing D functions that are called from R, using D's metaprogramming to write the necessary bindings for you.
>
> My programs usually take the opposite approach, where D is the primary language, and I call into R to fill in missing functionality. I've accumulated a large collection of code snippets to enable all kinds of things. The problem is that they were scattered across many projects, there was no consistency across programs, documentation didn't exist, and they were more or less useless to anyone other than me.
>
> [This GitHub repo](https://github.com/bachmeil/betterr) includes D modules, tests demonstrating most of the functionality, documentation, and some posts about how I do specific things. I'm sharing publicly all the things I've been doing in case it has value to anyone else.
>
> Examples of functionality:
>
> - Creating, accessing, and mutating R data structures, including vector, matrix, data frame, list, array, and time series types. Reference counting handles memory management.
> - Basic statistical functionality like calculating the mean. Many of these functions use Mir for efficiency.
> - Linear algebra
> - Random number generation and sampling
> - Parallel random number generation
> - Numerical optimization: direct access to the C libraries used by R's optim function
> - Quadratic programming
> - Passing D functions to R without creating a shared library. For example, you can use a D function as the objective function you pass to constrOptim for constrained optimization problems.
>
> [Project website](https://bachmeil.github.io/betterr/)

This is very cool! I've never used R, but I have wanted to learn more about such languages.

> There's more detail on the website, but I used the name "Better R" because the entirety of R is available inside your D program and you can use D to improve on it as much as you'd like. Feel free to hate the name.

Awful, awful name...

-Steve
Jun 29 2023
On Thursday, 29 June 2023 at 23:51:44 UTC, bachmeier wrote:
> If you try it and have problems, you can [create a discussion](https://github.com/bachmeil/betterr/discussions). You can also post in this forum, but I won't guarantee I'll see it.

Super cool, congrats!
Jun 30 2023
On Thursday, 29 June 2023 at 23:51:44 UTC, bachmeier wrote:
> [snip]

Glad you're continuing to do work on this front. There's a lot of great material explaining things, which is always good.

It would be cool to have another version of the link below for using a mir Slice with R.

https://bachmeil.github.io/betterr/setvar.html
Jun 30 2023
On Friday, 30 June 2023 at 16:14:48 UTC, jmh530 wrote:
> On Thursday, 29 June 2023 at 23:51:44 UTC, bachmeier wrote:
>> [snip]
>
> Glad you're continuing to do work on this front. There's a lot of great material explaining things, which is always good.
>
> It would be cool to have another version of the link below for using a mir Slice with R.
>
> https://bachmeil.github.io/betterr/setvar.html

I assume you mean that you've allocated memory on the D side, like this:

```
import mir.ndslice.slice : Slice, sliced;

auto a = new double[24];            // memory owned by D's GC
a[] = 1.6;                          // fill every element
Slice!(double*, 1) s = a.sliced();  // 1-D mir view over a
```

and you want to pass s to R for further analysis. Unfortunately, that will not work. R functions only work with memory R has allocated. It has a single struct type, so there's no way to pass s in this example to R. The best you can do right now is something like this:

```
auto a = Vector(24);
Slice!(double*,1) s = a.ptr[0..24].sliced();
// Manipulate s
// Send a as an argument to R functions
```

In other words, you let R allocate a, and then you work with the underlying data array as a slice.

A way around this limitation would be to implement the same struct (SEXPREC) in D, while avoiding issues with R's garbage collector. That's a more involved problem than I've been willing to take on. If someone has the interest, the SEXPREC struct is defined here:

https://github.com/wch/r-source/blob/060f8b64a3a8e489d8684c18b269eea63f182e73/src/include/Defn.h#L184

and the internals are documented here:

https://cran.r-project.org/doc/manuals/r-release/R-ints.html#SEXPs

As much fun as it is to figure these things out, I have never had sufficient time or motivation to do so.
Jun 30 2023
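To make the "let R allocate, then view the data as a slice" idea concrete without needing R installed, here is a minimal sketch in plain D. It assumes `malloc` as a stand-in for memory owned by another runtime; `foreignAlloc` and `viewOf` are hypothetical helper names, not part of betterr or Mir. The point is only that slicing a raw pointer gives a view, not a copy, so writes through the slice land in the foreign buffer.

```d
import core.stdc.stdlib : malloc, free;
import std.stdio : writeln;

/// Stand-in for memory owned by another runtime (R, in the post above).
/// Hypothetical helper: real code would obtain this pointer from the R side.
double* foreignAlloc(size_t n)
{
    auto p = cast(double*) malloc(n * double.sizeof);
    assert(p !is null, "allocation failed");
    return p;
}

/// View n doubles starting at p as a D slice. No data is copied.
double[] viewOf(double* p, size_t n)
{
    return p[0 .. n];
}

void main()
{
    enum n = 24;
    double* raw = foreignAlloc(n);
    scope(exit) free(raw);

    double[] s = viewOf(raw, n);
    s[] = 1.6;   // writes go straight into the foreign buffer
    s[3] = 2.5;

    // The same values are visible through the raw pointer.
    writeln(raw[0], " ", raw[3]);
}
```

The slice carries only a pointer and a length, so whoever allocated the buffer stays responsible for freeing it; in the post's version that owner is R.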
On Friday, 30 June 2023 at 18:47:06 UTC, bachmeier wrote:
> [snip]
> I assume you mean that you've allocated memory on the D side, like this:
>
> ```
> auto a = new double[24];
> a[] = 1.6;
> Slice!(double*, 1) s = a.sliced();
> ```
>
> and you want to pass s to R for further analysis. Unfortunately, that will not work. R functions only work with memory R has allocated. It has a single struct type, so there's no way to pass s in this example to R.

Unfortunate, but understood. Looking at the implementation of Vector, the constructor and opAssign look like they have to copy the data over anyway.

> [snip]
> As much fun as it is to figure these things out, I have never had sufficient time or motivation to do so.

Yeah, that seems like it would be a bit hairy to figure out.
Jun 30 2023
On Friday, 30 June 2023 at 16:14:48 UTC, jmh530 wrote:
> On Thursday, 29 June 2023 at 23:51:44 UTC, bachmeier wrote:
>> [snip]
>
> Glad you're continuing to do work on this front. There's a lot of great material explaining things, which is always good.
>
> It would be cool to have another version of the link below for using a mir Slice with R.
>
> https://bachmeil.github.io/betterr/setvar.html

I was wrong. They added custom allocators a while back, but didn't tell anyone.

Actually, what I said before is technically correct. The SEXP struct itself still has to be allocated by R and managed by the R garbage collector. It's just that you can use a custom allocator to send a pointer to the data you've allocated, and once R is done with the data, it'll call the function you've provided to free the memory before destroying the SEXP struct that wraps it.

I uploaded [an example here](https://github.com/bachmeil/betterr/blob/main/testing/testalloc.d). It's still a bit hackish because you need to adjust the pointer for a header R inserts when it allocates arrays. Adjusting by 10*double.sizeof works in this example, but "my test didn't segfault" doesn't exactly inspire confidence. Once I am comfortable with this solution, I'll do a new release of betterr.

This'll be kind of a big deal if it works. For instance, if you want to use a database interface and D doesn't have one, you can use R's interface to that database without having R manage your project's memory. You could use any of the available R interfaces (databases, machine learning libraries, Qt, etc.)
Jul 07 2023
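The header adjustment is easier to see in a toy model. The sketch below involves no R at all: `hostAllocVector` is a mock host written for this illustration, and `Block`, `allocWithHeadroom`, `memAlloc`, and `memFree` are hypothetical names. It is one reading of the description above, not the code in testalloc.d. The assumption is that the host asks the allocator for header plus payload bytes, writes its header at the front, and treats everything after the header as the array data. The `10 * double.sizeof` offset is taken from the post and is not a documented constant.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memset;
import std.stdio : writeln;

// Offset reported in the thread; an assumption here, not a documented value.
enum size_t headerBytes = 10 * double.sizeof;

/// One D-side allocation: room for the host's header followed by n doubles.
struct Block
{
    void* base;     // start of the allocation, where the host writes its header
    double[] data;  // payload located headerBytes past base
}

Block allocWithHeadroom(size_t n)
{
    void* base = malloc(headerBytes + n * double.sizeof);
    assert(base !is null, "allocation failed");
    auto payload = cast(double*)(cast(ubyte*) base + headerBytes);
    return Block(base, payload[0 .. n]);
}

/// Allocator callback: hands the pre-allocated block to the host.
void* memAlloc(Block* blk, size_t bytes)
{
    assert(bytes <= headerBytes + blk.data.length * double.sizeof);
    return blk.base;
}

/// Free callback: runs when the host is done with the memory.
void memFree(Block* blk, void* p)
{
    assert(p is blk.base);
    free(blk.base);
    *blk = Block.init;
}

/// Mock host: requests header + payload, writes a header, returns the payload pointer.
double* hostAllocVector(Block* blk, size_t n)
{
    void* p = memAlloc(blk, headerBytes + n * double.sizeof);
    memset(p, 0, headerBytes);  // stands in for the host filling in its header
    return cast(double*)(cast(ubyte*) p + headerBytes);
}

void main()
{
    auto blk = allocWithHeadroom(24);
    blk.data[] = 1.6;                          // fill the payload from D first
    double* seen = hostAllocVector(&blk, 24);
    writeln(seen is blk.data.ptr);             // do host and D agree on where the data starts?
    writeln(seen[0]);                          // payload as the host sees it
    memFree(&blk, blk.base);
}
```

If the real header size differs from the assumed offset, the host's data pointer and the D slice stop lining up, which is the kind of silent mismatch the "my test didn't segfault" remark is worried about.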
On Friday, 7 July 2023 at 20:33:08 UTC, bachmeier wrote:
> [snip]
> I was wrong. They added custom allocators a while back, but didn't tell anyone.
>
> Actually, what I said before is technically correct. The SEXP struct itself still has to be allocated by R and managed by the R garbage collector. It's just that you can use a custom allocator to send a pointer to the data you've allocated, and once R is done with the data, it'll call the function you've provided to free the memory before destroying the SEXP struct that wraps it.
>
> I uploaded [an example here](https://github.com/bachmeil/betterr/blob/main/testing/testalloc.d). It's still a bit hackish because you need to adjust the pointer for a header R inserts when it allocates arrays. Adjusting by 10*double.sizeof works in this example, but "my test didn't segfault" doesn't exactly inspire confidence. Once I am comfortable with this solution, I'll do a new release of betterr.
>
> This'll be kind of a big deal if it works. For instance, if you want to use a database interface and D doesn't have one, you can use R's interface to that database without having R manage your project's memory. You could use any of the available R interfaces (databases, machine learning libraries, Qt, etc.)

Cool.

The main thing I want to try is rstan. They have an interface called cmdstan that you can call from the command line, which would be possible to use with D. The problem is that you have to write the data to a CSV file and then read it back, so it would be kind of slow, and I never got around to playing with it in D.

With your tool as it is, I would just have to copy the data in memory, which I would expect to cost less than the file IO (but again, I haven't gotten around to doing anything with it).
Jul 07 2023