digitalmars.D - D-Link with R/MATLAB/Julia/SQL

reply "Siavash Babaei" <siavash.babaei gmail.com> writes:
Hi,

I primarily work in statistical modelling of financial data (risk 
modelling). I am at a point when I need to think about developing 
applications in addition to data analysis and modelling.
My primary concern is whether or not I can run/call-on programmes 
that I have written in R/MATLAB/Julia. Accessing a database is 
also a concern.
Now, I know C++/Visual Studio can handle this but I would like 
something more reliable and exciting (!better!) and I would like 
to know if I am in the right place.

Thank You
Dec 05 2013
next sibling parent reply "Rikki Cattermole" <alphaglosined gmail.com> writes:
On Thursday, 5 December 2013 at 11:13:17 UTC, Siavash Babaei 
wrote:
 Hi,

 I primarily work in statistical modelling of financial data 
 (risk modelling). I am at a point when I need to think about 
 developing applications in addition to data analysis and 
 modelling.
 My primary concern is whether or not I can run/call-on 
 programmes that I have written in R/MATLAB/Julia. Accessing a 
 database is also a concern.
 Now, I know C++/Visual Studio can handle this but I would like 
 something more reliable and exciting (!better!) and I would 
 like to know if I am in the right place.

 Thank You
D can use C libraries and a good deal of C++ libraries. Building D as a shared library isn't well supported at the moment, but it seems MATLAB code can be compiled into a shared library [1]. For both you will need to write a binding. DerelictUtil is a library that helps with bindings by loading function pointers from shared libraries [2].

For databases, we don't have much. Vibe.d has good wrappers around Mongo and Redis [3]. I have made bindings to OpenDBX [4], which supports multiple backends [5]. However, that is just the C API, so there is nothing nice about it.

There are also other libraries which I haven't mentioned because I haven't tried them. You can see a list of projects currently being maintained using dub (build manager) [6].

If you need any help with this, let us know!

[1] http://www.mathworks.com.au/support/compilers/interface.html
[2] https://github.com/DerelictOrg/DerelictUtil
[3] https://github.com/rejectedsoftware/vibe.d
[4] https://github.com/rikkimax/Derelict_Extras---OpenDBX
[5] http://www.linuxnetworks.de/doc/index.php/OpenDBX/Support
[6] http://code.dlang.org
Dec 05 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-12-05 13:03, Rikki Cattermole wrote:

 Databases we have not too much in the way of that. Vibe-d has good
 wrappers around Mongo and Redis [3]. I have made bindings to OpenDBX [4]
 which support multiple backends [5]. However that is just the c api so
 nothing nice about it.
 There is also other libraries which I haven't mentioned. Because I
 haven't tried them. You can see a list of projects currently being
 maintained using dub (build manager) [6].
Here's a list of libraries as well: http://wiki.dlang.org/Libraries_and_Frameworks

-- 
/Jacob Carlborg
Dec 05 2013
prev sibling next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Thursday, 5 December 2013 at 11:13:17 UTC, Siavash Babaei 
wrote:
 Hi,

 I primarily work in statistical modelling of financial data 
 (risk modelling). I am at a point when I need to think about 
 developing applications in addition to data analysis and 
 modelling.
 My primary concern is whether or not I can run/call-on 
 programmes that I have written in R/MATLAB/Julia. Accessing a 
 database is also a concern.
 Now, I know C++/Visual Studio can handle this but I would like 
 something more reliable and exciting (!better!) and I would 
 like to know if I am in the right place.

 Thank You
Anything that can be compiled to a library conforming to the native C ABI can be called directly from D. This is definitely possible with Matlab code; you'd just have to write a D declaration for the function you want to call.

Julia doesn't have static compilation yet, or an official C API as far as I know, so you'd have to do some form of IPC. If you're only calling a function once every so often, you could probably get away with communicating through files. I think Julia supports pipes, which would be better but possibly more work. See std.process.

R has the .C and .Call interfaces with relevant C headers. I don't know of any D port. Your options:
a) Port as much or as little as you need of the R headers to D.
b) Write a tiny wrapper function in C for each R function you need, along with a D declaration. Call the C wrapper from D.

TL;DR:
Matlab: Should work perfectly, almost works out of the box.
Julia: You have to ride the convenience/overhead tradeoff curve.
R: If you know how to use the .C or .Call C interface, then it should be trivial to interface with D.
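
To make the first two points concrete, here is a minimal D sketch under stated assumptions, not a definitive recipe. The `strlen` declaration stands in for whatever function a real native library exposes (for a Matlab-generated library, the declarations would mirror its generated C header instead). `runTool` is a hypothetical helper name, and the `julia` command line assumes a Julia executable is available on the PATH.

```d
import std.process : execute;
import std.stdio : writeln;
import std.string : strip;

// D declaration for a function that lives in a native C library.
// strlen comes from the C runtime, which D programs already link against.
extern (C) size_t strlen(const(char)* s) nothrow @nogc;

// Hypothetical helper: runs an external program and returns its
// trimmed standard output, throwing if the exit status is non-zero.
string runTool(string[] command)
{
    auto result = execute(command);
    if (result.status != 0)
        throw new Exception("command failed: " ~ result.output);
    return result.output.strip;
}

void main()
{
    // D string literals are zero-terminated, so they can be handed to C.
    writeln(strlen("risk model"));

    // Assumption: a `julia` executable is on the PATH. If it is missing,
    // std.process throws and the message is printed instead.
    try
    {
        writeln(runTool(["julia", "-e", "print(1 + 2)"]));
    }
    catch (Exception e)
    {
        writeln("could not run julia: ", e.msg);
    }
}
```

Starting a new process for every call is the slow end of the convenience/overhead curve; keeping one process alive and talking to it over pipes would cost more code but less time per call.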
Dec 05 2013
prev sibling next sibling parent "seany" <seany uni-bonn.de> writes:
On Thursday, 5 December 2013 at 11:13:17 UTC, Siavash Babaei 
wrote:
 Hi,

 I primarily work in statistical modelling of financial data 
 (risk modelling). I am at a point when I need to think about 
 developing applications in addition to data analysis and 
 modelling.
 My primary concern is whether or not I can run/call-on 
 programmes that I have written in R/MATLAB/Julia. Accessing a 
 database is also a concern.
 Now, I know C++/Visual Studio can handle this but I would like 
 something more reliable and exciting (!better!) and I would 
 like to know if I am in the right place.

 Thank You
I have a similar issue (climate modelling). I have a C++ manager where I can issue commands: the command "writemodel" (without quotes) will start a LaTeX interface; the command "parsemodel" (also without quotes, always without quotes) will parse the model into data that a Scilab / R script can work on; then "drivemodel" will drive the model, i.e. run Scilab / R / Octave etc. as I run it.

However, all these processes (the writing, the model driving, etc.) can talk to each other by means of a file, which acts as a message bus. Each program can demand an immediate action (e.g. writemodel can demand an immediate break, update and rerun of a currently driven model).

I have been working on a D program managing (encoding, decoding, queueing and popping) all these messages between such processes. It sounds trivial, but the commands are to be ordered according to their severity of demand, and in the case of multiple commands having the same severity you want them to be executed at the same physical time in multiple threads / cores, and so on. Therefore, the whole indexing / sorting is implemented using a very generalized version of surreal numbers, whose sole purpose is to describe a (possibly self-similar, not necessarily one-dimensional) order.
Dec 05 2013
prev sibling next sibling parent "tn" <no email.com> writes:
On Thursday, 5 December 2013 at 11:13:17 UTC, Siavash Babaei 
wrote:
 Hi,

 I primarily work in statistical modelling of financial data 
 (risk modelling). I am at a point when I need to think about 
 developing applications in addition to data analysis and 
 modelling.
 My primary concern is whether or not I can run/call-on 
 programmes that I have written in R/MATLAB/Julia. Accessing a 
 database is also a concern.
 Now, I know C++/Visual Studio can handle this but I would like 
 something more reliable and exciting (!better!) and I would 
 like to know if I am in the right place.

 Thank You
For Matlab there is also this: https://github.com/Trass3r/MatD (I haven't tried it.)
Dec 05 2013
prev sibling parent reply "bachmeier" <nospam nospam.com> writes:
On Thursday, 5 December 2013 at 11:13:17 UTC, Siavash Babaei 
wrote:
 Hi,

 I primarily work in statistical modelling of financial data 
 (risk modelling). I am at a point when I need to think about 
 developing applications in addition to data analysis and 
 modelling.
 My primary concern is whether or not I can run/call-on 
 programmes that I have written in R/MATLAB/Julia. Accessing a 
 database is also a concern.
 Now, I know C++/Visual Studio can handle this but I would like 
 something more reliable and exciting (!better!) and I would 
 like to know if I am in the right place.

 Thank You
I have been using R and D together for several months. It is easy to embed either inside the other. Unfortunately I have not found the time to write up any decent notes about it. With the latest release of DMD, my workflow is to create shared libraries with D and call them from R. Using the .C interface from R is simple, but not much fun. I wrote up a few notes here:

http://lancebachmeier.com/wiki/doku.php?id=rlang:calling-d

What I've been working on lately is a .Call-based interface that provides some of the same convenience as Rcpp. I call into Gretl's C API to get a wide range of econometric functions. It works well enough that I'm going to give a lecture to my graduate students on calling D functions from R next week.

To embed R inside D, which is maybe more what you're after, I created a simple C interface to RInside. That works well. Unfortunately I don't have notes on that either.

I'll be happy to share more if any of this will help you.
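
As an illustration of the .C route, here is a minimal sketch of a D function that R could call. It is not taken from the notes linked above; the names `scaleVector`, `scale.d` and `libscale.so` are hypothetical. Under R's .C convention every argument arrives as a pointer, so a numeric vector becomes `double*` and its length an `int*`.

```d
// scale.d (hypothetical file name)
// Assumed build line on Linux: dmd -shared -fPIC scale.d -oflibscale.so

// Multiplies the n elements of x by factor, in place.
// extern (C) gives the symbol an unmangled name that R can look up.
// The function avoids the garbage collector, so it does not depend on
// the D runtime having been initialised by the host process.
extern (C) void scaleVector(double* x, int* n, double* factor) nothrow @nogc
{
    foreach (i; 0 .. *n)
        x[i] *= *factor;
}

// Local check, run with: dmd -unittest -main -run scale.d
unittest
{
    double[] data = [1.0, 2.0, 3.0];
    int n = cast(int) data.length;
    double factor = 2.0;
    scaleVector(data.ptr, &n, &factor);
    assert(data == [2.0, 4.0, 6.0]);
}
```

On the R side, the call would look roughly like dyn.load("libscale.so") followed by .C("scaleVector", as.double(x), as.integer(length(x)), as.double(2)), which returns a list holding the modified arguments. D code that allocates with the garbage collector would additionally need the D runtime initialised before the first call.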
Dec 05 2013
next sibling parent "Siavash Babaei" <siavash.babaei gmail.com> writes:
One thing that professional programmers often miss is that 
by-and-large, many of the users of programming languages are not 
professional programmers by trade but they require them as tools 
to do their job and as such may not be as well-versed in many of 
the nooks-and-crannies. Hence, us-mere-mortals rely on ready-made 
libraries/packages that you master programmers design and 
implement in order to achieve our goals.
On another note, seeing as data mining and analysis is going to 
be probably one of the most important functions of any business 
for a foreseeable future, I think any decent language should have 
a more-or-less easy and straightforward way of 
interfacing/linking to languages/packages that handle data and 
their analysis, e.g., it should be fairly easy to access 
databases, spreadsheets, and the major data analysis tools like 
R/MATLAB/OCTAVE. Furthermore, one should not need to write code 
in another language like C in order to do that, i.e., D should be 
more or less self-sufficient.
The whole idea behind D is amazing and the language’s objectives 
seem very promising, but should it be my first real attempt at 
learning a general purpose programming language?! I am still 
baffled …
Dec 07 2013
prev sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 06/12/13 03:16, bachmeier wrote:
 I have been using R and D together for several months. It is easy to embed
 either inside the other. Unfortunately I have not found the time to write up
any
 decent notes about it. With the latest release of DMD, my workflow is to create
 shared libraries with D and call them from R. Using the .C interface from R is
 simple, but not much fun. I wrote up a few notes here:

 http://lancebachmeier.com/wiki/doku.php?id=rlang:calling-d
That's very cool, and also something of a relief to see, as I think interfacing nicely with R is an essential feature for a programming language in some domains of research.
 What I've been working on lately is a .Call-based interface that provides some
 of the same convenience as Rcpp. I call into Gretl's C API to get a wide range
 of econometric functions. It works well enough that I'm going to give a lecture
 to my graduate students on calling D functions from R next week.

 To embed R inside D, which is maybe more what you're after, I created a simple
C
 interface to RInside. That works well. Unfortunately I don't have notes on that
 either.

 I'll be happy to share more if any of this will help you.
I think that you would be doing everyone a big favour if you would provide a good writeup on this. Can I also suggest making a talk proposal to DConf 2014 about this work?

Question: I remember a colleague who had worked on linking a C library into R telling me that he'd had to make copies of data before passing into R, because R expects variables to be immutable. Is there anything that you can do with D's support for immutability that lets you avoid this? Or are you constrained by the fact that you need to use an extern(C) API?
Dec 07 2013
next sibling parent reply "Siavash Babaei" <siavash.babaei gmail.com> writes:
It seems that I like the whole idea of D a bit too much to act 
conservatively. So I will start learning it and hope that things 
will get better by then. Although, I have to insist again, having 
"something" to call programmes/functions in R and MATLAB easily 
and readily is a must for D.
Julia is also an upstart and very intriguing language and seems 
to have a solid basis, so it might not be a bad idea to 
collaborate with their Devs.
On the business side, it is probably not the best idea to `sell' 
the best and one of the very few learning materials when you are 
trying to compete with a monster like C++.
PS- Thank you for your help and once I have learned the language, 
I will perhaps ask for detailed help ...
Dec 09 2013
parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Monday, 9 December 2013 at 16:31:54 UTC, Siavash Babaei wrote:
 It seems that I like the whole idea of D a bit too much to act 
 conservatively. So I will start learning it and hope that 
 things will get better by then. Although, I have to insist 
 again, having "something" to call programmes/functions in R and 
 MATLAB easily and readily is a must for D.
 Julia is also an upstart and very intriguing language and seems 
 to have a solid basis, so it might not be a bad idea to 
 collaborate with their Devs.
 On the business side, it is probably not the best idea to 
 `sell' the best and one of the very few learning materials when 
 you are trying to compete with a monster like C++.
 PS- Thank you for your help and once I have learned the 
 language, I will perhaps ask for detailed help ...
Just a thought:

Matlab compiler (http://www.mathworks.co.uk/products/compiler/description2.html) produces a C header file for the library it generates. There is a tool, DStep (https://github.com/jacob-carlborg/dstep), that can convert C headers to D modules.

So, the following steps should work:
1) generate the C header and shared library from Matlab
2) run DStep on the header
3) import the generated D module
4) use the relevant Matlab functions you want in your program
5) link with the shared library when compiling.
Dec 09 2013
parent "Siavash Babaei" <siavash.babaei gmail.com> writes:
Thank you ... load off my mind
Dec 10 2013
prev sibling parent reply "bachmeier" <no spam.com> writes:
On Saturday, 7 December 2013 at 17:08:49 UTC, Joseph Rushton 
Wakeling wrote:
 On 06/12/13 03:16, bachmeier wrote:
 I have been using R and D together for several months. It is 
 easy to embed
 either inside the other. Unfortunately I have not found the 
 time to write up any
 decent notes about it. With the latest release of DMD, my 
 workflow is to create
 shared libraries with D and call them from R. Using the .C 
 interface from R is
 simple, but not much fun. I wrote up a few notes here:

 http://lancebachmeier.com/wiki/doku.php?id=rlang:calling-d
That's very cool, and also something of a relief to see, as I think interfacing nicely with R is an essential feature for a programming language in some domains of research.
 What I've been working on lately is a .Call-based interface 
 that provides some
 of the same convenience as Rcpp. I call into Gretl's C API to 
 get a wide range
 of econometric functions. It works well enough that I'm going 
 to give a lecture
 to my graduate students on calling D functions from R next 
 week.

 To embed R inside D, which is maybe more what you're after, I 
 created a simple C
 interface to RInside. That works well. Unfortunately I don't 
 have notes on that
 either.

 I'll be happy to share more if any of this will help you.
I think that you would be doing everyone a big favour if you would provide a good writeup on this. Can I also suggest making a talk proposal to DConf 2014 about this work?
I will write something up as soon as I can, but it might be a while before that happens. I will also share my code for the .Call interface. Hopefully I will get feedback from someone more knowledgeable about D. I'll leave DConf to the experts. I'm an economist who taught himself to program.
 Question: I remember a colleague who had worked on linking a C 
 library into R telling me that he'd had to make copies of data 
 before passing into R, because R expects variables to be 
 immutable.  Is there anything that you can do with D's support 
 for immutability that lets you avoid this?  Or are you 
 constrained by the fact that you need to use an extern(C) API?
Your colleague might well be right, but I'm not familiar with that issue. In the other direction (R to C) you want to copy the data before mutating. Suppose you have

x <- c(1,2,3)
y <- x
.Call("foo", y)

R does a lazy copy of x into y, such that y is a pointer to x until the copy is actually needed. In this example, when you pass y from R to C, all you're passing is a pointer to the struct (SEXP). On the C side, you can do anything you want with that pointer, and it will modify both y and x unless you make a copy of y.

You can create an SEXP on the C side and pass it into R, but that's a bit clumsy so I avoid it. Whether I'm embedding R in C/C++/D or vice versa I do the allocation in R. Moving data between R and D is nothing more than passing a pointer around.
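
The aliasing described above can be mimicked in plain D, as a rough analogy rather than real R interop: two D slices sharing one block of memory play the role of x and y, and `doubleInPlace` is a hypothetical stand-in for a function reached through .Call. A real .Call function would receive an SEXP and would copy it with R's own API (for example `Rf_duplicate`) before mutating.

```d
// Hypothetical stand-in for native code that only receives a pointer.
extern (C) void doubleInPlace(double* values, size_t count) nothrow @nogc
{
    foreach (i; 0 .. count)
        values[i] *= 2;
}

void main()
{
    double[] x = [1.0, 2.0, 3.0];
    double[] y = x; // y shares x's memory, much like R's lazy copy

    // Mutating through y's pointer also changes x.
    doubleInPlace(y.ptr, y.length);
    assert(x == [2.0, 4.0, 6.0]);

    // Copying first keeps the original intact.
    double[] z = x.dup;
    doubleInPlace(z.ptr, z.length);
    assert(x == [2.0, 4.0, 6.0]);
    assert(z == [4.0, 8.0, 12.0]);
}
```

The copy costs one extra allocation per call, which is the price of leaving the caller's data untouched.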
Dec 09 2013
next sibling parent reply "bachmeier" <no spam.com> writes:
Here is a related link that is relevant to D:

https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20131209/ec911711/attachment.pl

It's an announcement that he's moving to Julia, but part of the 
problem is

"Eigen is a complex code base using what is called "template
meta-programming" in C++.  Making modifications to such code can 
be
difficult.  I can't claim to fully understand all the details in 
Eigen and
in Rcpp.  I am a user of these code bases, not a developer."

That's my experience as well. C++ is fine for simple things, but 
once you cross the line from simple to intermediate, all hell 
breaks loose. D is IMO a much better choice for working with R 
because it's possible to understand what is going on inside the D 
code even if you have less than 10 years of D programming 
experience.
Dec 09 2013
parent "Siavash Babaei" <siavash.babaei gmail.com> writes:
Since it is not your (no-one specifically) job to do so and you 
are probably not as `expert', it is likely that you will mess 
things up. It is also time consuming ... The concept of "division 
of labour" has been around for several thousand years, after all.
C++, R, MATLAB, and the like have really aged and besides a large 
community/ecosystem, you cannot consider them good programming 
languages anymore. It would be nice to see more collaboration 
between D and Julia, while providing support for other necessary 
languages, so that things get done in the mean time.
Dec 10 2013
prev sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 09/12/13 21:22, bachmeier wrote:
 I will write something up as soon as I can, but it might be a while before that
 happens. I will also share my code for the .Call interface. Hopefully I will
get
 feedback from someone more knowledgeable about D.

 I'll leave DConf to the experts. I'm an economist who taught himself to
program.
It's always worth remembering that while there are probably plenty of people here who are more expert in D, you may be much rarer in having practical experience of R and in using it effectively in combination with D.

The other thing to remember is that it's valuable to highlight diverse use-cases for the language. Don't be shy about things like this; simply publicizing the challenges of what you're doing is valuable in itself.

So, don't assume that DConf would not want to hear about your work :-)
 Your colleague might well be right, but I'm not familiar with that issue. In
the
 other direction (R to C) you want to copy the data before mutating. Suppose you
 have

 x <- c(1,2,3)
 y <- x
 .Call("foo", y)

 R does a lazy copy of x into y, such that y is a pointer to x until the copy is
 actually needed. In this example, when you pass y from R to C, all you're
 passing is a pointer to the struct (SEXP). On the C side, you can do anything
 you want with that pointer, and it will modify both y and x unless you make a
 copy of y.

 You can create an SEXP on the C side and pass it into R, but that's a bit
clumsy
 so I avoid it. Whether I'm embedding R in C/C++/D or vice versa I do the
 allocation in R. Moving data between R and D is nothing more than passing a
 pointer around.
Ahh, OK. Good to know for future reference -- thank you!
Dec 10 2013