digitalmars.D.announce - R and D interop with saucer

reply data pulverizer <data.pulverizer gmail.com> writes:
Hi,

Announcing my saucer project (https://github.com/chibisi/saucer) 
that allows D to be called from R in a similar way that Rcpp 
allows C++ code to be called from R. At the moment it targets 
only Linux machines but in time should gain Mac OS and Windows 
support.

Information about the current work on the project can be found on 
the Milestones page: 
https://github.com/chibisi/saucer/milestones, though there is 
also a TODO document.

If anyone in the community uses R, I'd appreciate bug reports 
filed in the repository issues section and suggestions here. The 
package is not yet on CRAN, but that is one of the milestones.

Thanks
Dec 29 2023
parent reply Sergey <kornburn yandex.ru> writes:
On Friday, 29 December 2023 at 23:05:12 UTC, data pulverizer 
wrote:
 Hi,

 Announcing my saucer project 
 (https://github.com/chibisi/saucer) that allows D to be called 
 from R in a similar way that Rcpp allows C++ code to be called 
 from R. At the moment it targets only Linux machines but in 
 time should gain Mac OS and Windows support.

 Information about the current work on the project can be found 
 on the Milestones page: 
 https://github.com/chibisi/saucer/milestones, though there is 
 also a TODO document.

 If anyone in the community uses R, I'd appreciate bug reports 
 filed in the repository issues section and suggestions here. 
 The package is not yet on CRAN, but that is one of the 
 milestones.

 Thanks
Hi! Thanks for open-sourcing the project. I remember asking about it on Twitter some time ago :)

I remember you previously mentioned that you are familiar with the EmbedR project, and that your library takes a different approach. It would be cool to have a comparison, or a note on the GitHub page, about the differences from that second solution.
Dec 29 2023
parent reply data pulverizer <data.pulverizer gmail.com> writes:
On Friday, 29 December 2023 at 23:51:44 UTC, Sergey wrote:
 Hi! Thanks for open sourcing the project
 I remember some time ago I asked in Twitter about it :)

 I remember you previously mentioned that you are familiar with 
 the EmbedR project, and that your library has a different 
 approach: it would be cool to have a comparison or a note on 
 the GitHub page about the differences from the second solution.
That's a great point. I really can't remember what stage of writing saucer I was at when I became aware of EmbedR, but one thing I didn't understand was why it had to have pre-compiled (Weka?) code within the package. This is the `R.lib` file, which is a nonstarter security-wise. I also felt that such a project should have strong syntactic similarities with Rcpp to facilitate adoption, that it should have a nicer, easier interface, that it would be a good learning experience for me, and that I could (probably) do a decent job at it.

I have updated the package to include a reference to EmbedR outlining these points. Interestingly enough, there is a Rust package for R and D interop called embedr as well (https://docs.rs/extendr-api/latest/extendr_api/).
Dec 29 2023
next sibling parent data pulverizer <data.pulverizer gmail.com> writes:
On Saturday, 30 December 2023 at 00:50:54 UTC, data pulverizer 
wrote:
 I have updated the package to include a reference to EmbedR 
 outlining these points. Interestingly enough, there is a Rust 
 package for R and D interop called embedr as well 
 (https://docs.rs/extendr-api/latest/extendr_api/).
Obvious mistake here: the Rust package is called extendr rather than embedr.
Jan 02
prev sibling parent reply bachmeier <no spam.net> writes:
On Saturday, 30 December 2023 at 00:50:54 UTC, data pulverizer 
wrote:

 That's a great point. I really can't remember what stage of 
 writing saucer I was at when I became aware of EmbedR, but one 
 thing I didn't understand was why it had to have pre-compiled 
 (Weka?) code within the package. This is the `R.lib` file, 
 which is a nonstarter security-wise. I also felt that such a 
 project should have strong syntactic similarities with Rcpp to 
 facilitate adoption, that it should have a nicer, easier 
 interface, that it would be a good learning experience for me, 
 and that I could (probably) do a decent job at it.

 I have updated the package to include a reference to EmbedR 
 outlining these points. Interestingly enough, there is a Rust 
 package for R and D interop called embedr as well 
 (https://docs.rs/extendr-api/latest/extendr_api/).
Here is the updated version of embedr: https://github.com/bachmeil/embedrv2

The old version you're referencing is from ages ago. I don't know what you mean by Weka code. There was an old import library from when I tried to get embedding of R inside D to work on Windows.

The updated library generates the R bindings for the user. There might be something useful there, and the code isn't very long: https://github.com/bachmeil/embedrv2/blob/main/inst/embedr/r.d#L1228

There's an extern_R attribute to specify which functions to export to R. Here's an example from the readme:

```
@extern_R ar1irf(double alpha, double shock, int h) {
	double[] result;
	result ~= shock;
	foreach(ii; 1..h) {
		result ~= alpha * result.last;
	}
	return result;
}
mixin(exportRFunctions);

double last(double[] v) {
	if (v.length > 0) {
		return v[$-1];
	} else {
		return 0.0;
	}
}
```

I honestly don't use D much as a replacement for Rcpp any longer. I mostly work in the other direction these days: https://bachmeil.github.io/betterr/ I've been adding Windows support lately so I can share my code with others.

There are possibly things in there that are useful to you (whether D calls R or R calls D is not relevant for working with the API). For instance, if you want to pass a function to R without creating a shared library (one example being an objective function for optimization): https://bachmeil.github.io/betterr/funcptr.html

Another is using a custom allocator to pass data to R even if it was allocated in D: https://github.com/bachmeil/betterr/blob/main/testing/testalloc.d

There are pieces of knowledge I gained by debugging segfaults, like the fact that certain operations will change the pointer of an array and that sort of thing.
Jan 05
parent reply data pulverizer <data.pulverizer gmail.com> writes:
On Friday, 5 January 2024 at 16:16:49 UTC, bachmeier wrote:
 On Saturday, 30 December 2023 at 00:50:54 UTC, data pulverizer 
 wrote:

 Here is the updated version of embedr: 
 https://github.com/bachmeil/embedrv2

 The old version you're referencing is from ages ago. I don't 
 know what you mean by Weka code. There was an old import 
 library from when I tried to get embedding of R inside D to 
 work on Windows.

 The updated library generates the R bindings for the user. 
 There might be something useful there, and the code isn't very 
 long: 
 https://github.com/bachmeil/embedrv2/blob/main/inst/embedr/r.d#L1228
 There's an extern_R attribute to specify which functions to 
 export to R.

 Here's an example from the readme:

 ```
 @extern_R ar1irf(double alpha, double shock, int h) {
 	double[] result;
 	result ~= shock;
 	foreach(ii; 1..h) {
 		result ~= alpha * result.last;
 	}
 	return result;
 }
 mixin(exportRFunctions);

 double last(double[] v) {
 	if (v.length > 0) {
 		return v[$-1];
 	} else {
 		return 0.0;
 	}
 }
 ```

 I honestly don't use D much as a replacement for Rcpp any 
 longer. I mostly work in the other direction these days: 
 https://bachmeil.github.io/betterr/ I've been adding Windows 
 support lately so I can share my code with others.

 There are possibly things in there that are useful to you 
 (whether D calls R or R calls D is not relevant for working 
 with the API). For instance, if you want to pass a function to 
 R without creating a shared library (one example being an 
 objective function for optimization): 
 https://bachmeil.github.io/betterr/funcptr.html Another is 
 using a custom allocator to pass data to R even if it was 
 allocated in D: 
 https://github.com/bachmeil/betterr/blob/main/testing/testalloc.d
 There are pieces of knowledge I gained by debugging segfaults, 
 like the fact that certain operations will change the pointer 
 of an array and that sort of thing.
Thank you for your comments. I shall bear them in mind going forward. I should, however, make a correction and some clarifications.

Regarding your `R.lib` file (embedrv2 and the previous version), I thought it might have something to do with Weka, the machine learning library - I got my wires crossed there. Perhaps I was reading something to do with the ML library before/while I was reading your repo. My point still stands though. I've never worked anywhere that would allow a developer to install unverified pre-compiled code from an online repo. It would pose too much of a security issue.

As I explained, my approach is to mirror the 'design language' of the Rcpp/RInside/cpp11 libraries, because it's a great approach, I'm familiar with it, and so are many others. A user (including myself) will be sensitive to on-boarding and usage friction, so my library will present a clear, familiar, and easy to use interface.

I didn't know about your betterr R package. I think it is a different approach from the one I would take. I think both Rcpp and in particular cpp11 have very good design approaches, and continuous improvements to their design that give me plenty to think about. They are at the cutting edge and are pushing the boundaries, and I think it would be cool to show that D can play in the same space with ease, finesse, and style.

I'm pretty happy hacking the C API of R and have become quite familiar with its usage and foibles, though there is of course always more to learn. It's fun, as is using the power of D with R. As with many things it is a matter of patience, diligence, and continuous improvement. My focus is to bring the package up to scratch, to a point where I am happy with it.

To be clear, regarding saucer: at this stage (and for the foreseeable future), I'm monomaniacally focused on where I want to take the library.
Jan 09
parent reply Lance Bachmeier <no spam.net> writes:
On Tuesday, 9 January 2024 at 19:56:38 UTC, data pulverizer wrote:

I'd encourage you to approach this however you like, but for the 
sake of anyone else reading this, I want to correct a few points. 
I'm guessing you haven't spent any time understanding embedrv2.

 Regarding your `R.lib` file (embedrv2 and the previous 
 version), I thought it might have something to do with Weka, 
 the machine learning library - I got my wires crossed there. 
 Perhaps I was reading something to do with the ML library 
 before/while I was reading your repo. My point still stands 
 though. I've never worked anywhere that would allow a developer 
 to install unverified pre-compiled code from an online repo. It 
 would pose too much of a security issue.
That's not "unverified pre-compiled code". As I said, it's an import library for Windows, from an attempt long ago to call R from D on Windows. You don't call the .dll file directly on Windows, you call the .lib file. It's the same thing you do with OpenBLAS and many other popular libraries.
 As I explained, my approach is to mirror the 'design language' 
 of the Rcpp/RInside/cpp11 libraries, because it's a great 
 approach, I'm familiar with it, and so are many others. A user 
 (including myself) will be sensitive to on-boarding and usage 
 friction, so my library will present a clear, familiar, and 
 easy to use interface.
I'm familiar with Rcpp/RInside/cpp11. If you go to the CRAN page for RInside, you'll see I'm one of the authors. If you check out Dirk's 2013 book, you'll see that one of the sections in it was based on an example I gave him. I haven't done much with cpp11, but that's because I was already using D before it existed. embedrv2 does exactly the same thing. You write your D function and metaprogramming is used to create a wrapper that you can call from R without further modification.
 I didn't know about your betterr R package. I think it is a 
 different approach from the one I would take. I think both Rcpp 
 and in particular cpp11 have very good design approaches and 
 continuous improvements to their design that gives me plenty to 
 think about. They are at the cutting edge and are pushing the 
 boundaries, and I think it would be cool to show that D can 
 play in the same space with ease, finesse, and style.
Since you've clearly never used it and don't know how it works, why are you trashing it? I'll let anyone else judge how awkward, complicated and lacking in style it is. Here's the example from the landing page:

```
void main() {
  // Initialization
  startR();
  // Generate 100 draws from a N(0.5, sd=0.1) distribution
  Vector x = rnorm(100, 0.5, 0.1);
  // Create a 3x2 matrix and fill the columns
  auto m = Matrix(3,2);
  Column(m,0) = [1.1, 2.2, 3.3];
  Column(m,1) = [4.4, 5.5, 6.6];
  // Calculate the inverse of the transpose
  auto m2 = solve(t(m));
  // Modify the second and third elements of the first column of m
  m[1..$,0] = [7.7, 8.8];
  // Choose x and y to minimize x^2 + y^2
  // Use Nelder-Mead with initial guesses 3.5 and -5.5
  auto nm = NelderMead(&f);
  OptimSolution sol = nm.solve([3.5, -5.5]);
  // Clean up
  closeR();
}

extern(C) {
  double f(int n, double * par, void * ex) {
    return par[0]*par[0] + par[1]*par[1];
  }
}
```

That's ten lines of code to generate a vector of random numbers, create and fill a matrix, take the inverse of the transpose of the matrix, mutate the matrix, and solve a nonlinear optimization problem.

I don't care that you're not using it. Have fun creating your own project. That doesn't excuse writing falsehoods about the work I've done.
Jan 09
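A note on the `extern(C)` block in the example above: giving `f` C linkage is presumably what lets a C routine call it through a plain function pointer, with no shared library needed. The following is a minimal sketch of that mechanism only. It uses nothing from betterr or R; `Objective` and `evaluateAt` are hypothetical names invented for illustration, and `evaluateAt` merely stands in for whatever routine would eventually invoke the callback.

```d
import std.stdio : writeln;

// Hypothetical alias for a C-ABI callback with the same shape as `f`
// in the post: (number of parameters, parameter array, extra data).
alias Objective = extern(C) double function(int n, double* par, void* ex);

// Objective function from the post: x^2 + y^2.
extern(C) double f(int n, double* par, void* ex) {
    return par[0]*par[0] + par[1]*par[1];
}

// Hypothetical stand-in for an optimizer. It only evaluates the
// callback once, the way a C routine would call it via the pointer.
double evaluateAt(Objective fn, double[] point) {
    return fn(cast(int) point.length, point.ptr, null);
}

void main() {
    // Evaluate at the initial guesses used in the post.
    writeln(evaluateAt(&f, [3.5, -5.5]));
}
```

The `void* ex` parameter is left as `null` here; in a real callback it would be the usual place to pass extra data to the objective function.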
parent data pulverizer <data.pulverizer gmail.com> writes:
Unfortunately, your statements are, by and large, simply wrong. 
Not to mention openly hostile.

On Tuesday, 9 January 2024 at 21:25:30 UTC, Lance Bachmeier wrote:
 That's not "unverified pre-compiled code". As I said, it's an 
 import library for Windows, from an attempt long ago to call R 
 from D on Windows. You don't call the .dll file directly on 
 Windows, you call the .lib file. It's the same thing you do 
 with OpenBLAS and many other popular libraries.
The `R.lib` file in this folder (https://github.com/bachmeil/embedrv2/tree/main/inst/embedr) is unverified. As a user, I have no real way of verifying what it is. As far as I am concerned as a responsible user, it poses a cyber security threat, and a habit of downloading such files onto my system would result in me getting hacked. You are not a recognised software distributor such as Microsoft or a recognised Linux distributor, therefore downloading such a file in a business could result in exposing oneself to unlimited liability with regard to one's computer systems.

The OpenBLAS repository (https://github.com/OpenMathLib/OpenBLAS) does not contain precompiled code. Some of the releases do, but they have checksums (which your library does not), so you can run procedures on your system to verify that they are the same as those on the repo. HOWEVER, there is trust here, and despite the fact that OpenBLAS is a well recognised library, lots of workplaces would insist on compiling from source (including verifying the checksum) to ensure that they are getting what they expect.
 I'm familiar with Rcpp/RInside/cpp11. If you go to the CRAN 
 page for RInside, you'll see I'm one of the authors. If you 
 check out Dirk's 2013 book, you'll see that one of the sections 
 in it was based on an example I gave him. ...
This doesn't change anything.
 ... You write your D function and metaprogramming is used to 
 create a wrapper that you can call from R without further 
 modification.
I'm not sure what you are trying to say here, but my package does what it says it does.
 Since you've clearly never used it and don't know how it works, 
 why are you trashing it? I'll let anyone else judge how 
 awkward, complicated and lacking in style it is. Here's the 
 example from the landing page:
The first part of your first statement is right. I've never used betterr because I literally just found out about it, but I didn't trash it. I simply said that it's not how I would go about things. People have preferences about how they go about doing things. I can see that the way you do things is very different from mine, and that is okay.
 I don't care that you're not using it. Have fun creating your 
 own project. That doesn't excuse writing falsehoods about the 
 work I've done.
I AM having fun with my implementation, but I'm NOT trafficking in falsehoods.
Jan 09
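The checksum verification mentioned in the last post could be sketched in D roughly as follows. This is an illustration under stated assumptions, not a procedure taken from either project: `sha256Hex` and `matchesChecksum` are hypothetical helper names, the expected digest would have to come from a source you already trust, and the demonstration hashes the standard test string "abc" rather than a real release file.

```d
import std.ascii : LetterCase;
import std.digest : toHexString;
import std.digest.sha : sha256Of;
import std.file : read;
import std.stdio : writeln;
import std.uni : toLower;

// Hypothetical helper: lowercase hex SHA-256 digest of a byte buffer.
string sha256Hex(const(ubyte)[] data) {
    ubyte[32] hash = sha256Of(data);
    auto hex = toHexString!(LetterCase.lower)(hash); // char[64]
    return hex[].idup;
}

// Hypothetical helper: true if the file at `path` hashes to the
// published digest (compared case-insensitively).
bool matchesChecksum(string path, string expectedHex) {
    auto data = cast(const(ubyte)[]) read(path);
    return sha256Hex(data) == expectedHex.toLower;
}

void main() {
    // Hash an in-memory buffer so the sketch runs without any download.
    auto data = cast(const(ubyte)[]) "abc";
    writeln(sha256Hex(data));
}
```

A matching digest only shows that the file is the one the publisher hashed. As the post says, it does not remove the need to trust the publisher.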