based on https://stackoverflow.com/a/44827220/1639834:
I have an R routine that I need to call dynamically from my Python code. For this I intend to use rpy2.
First, the R code I would like to use from Python (I am a first-time R user):
# setting up dummy data to showcase R routine usage
set.seed(101)
data_sample <- c(5 + 3*rt(1000, df=5),
                 10 + 1*rt(10000, df=20))
num_components <- 2
# the routine itself
library(teigen)
tt <- teigen(data_sample,
             Gs=num_components,
             scale=FALSE, dfupdate="numeric",
             models=c("univUU")
)
df = c(tt$parameters$df)
mean = c(tt$parameters$mean)
scale = c(tt$parameters$sigma)
The arguments data_sample and num_components are computed dynamically by my Python code, where num_components is just an integer and data_sample is a numpy array.
As an end goal I would like to have df, mean and scale back in the "Python world" as lists or numpy arrays, so that I can process them further and use them later in my program logic.
My first experiment to tackle this with rpy2 so far:
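import rpy2
from rpy2.robjects.packages import importr
from rpy2 import robjects as ro
# get_student_t_data (not shown) returns the numpy array of samples
numpy_t_mix_samples = get_student_t_data(n_samples=10000)
r_t_mix_samples = ro.FloatVector(numpy_t_mix_samples)
teigen = importr('teigen')
# R's c() does not exist in Python; a plain string is passed to R as a length-one character vector
rres = teigen.teigen(r_t_mix_samples, Gs=2, scale=False, dfupdate="numeric", models="univUU")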
Here the argument for Gs is still hardcoded, but as laid out above it should later be dynamic.
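A minimal sketch of how the call might be wrapped so that both arguments come from Python. The function name fit_teigen is hypothetical, and the sketch assumes that rpy2 and the R package teigen are installed. The rpy2 imports are deferred into the function body only so that the file can be loaded on a machine without R.

```python
def fit_teigen(data_sample, num_components):
    """Call R's teigen() with a sample and a component count supplied from Python.

    Hypothetical helper: assumes rpy2 and the R package 'teigen' are installed.
    """
    # Deferred imports, so defining this function does not require R.
    from rpy2 import robjects as ro
    from rpy2.robjects.packages import importr

    teigen = importr("teigen")

    # Convert explicitly: numpy scalars are not plain Python floats or ints,
    # so each value is cast before it is handed to R.
    r_samples = ro.FloatVector([float(x) for x in data_sample])

    return teigen.teigen(
        r_samples,
        Gs=int(num_components),
        scale=False,
        dfupdate="numeric",
        models="univUU",
    )
```

With this in place the call would read rres = fit_teigen(numpy_t_mix_samples, num_components), with no value fixed in the call itself.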
rres then prints mostly incomprehensible output (I guess because it has not yet been converted properly by rpy2):
R object with classes: ('teigen',) mapped to:
<ListVector - Python:0x11e3fdc48 / R:0x7ff7d229dcb0>
[Float..., Matrix, ListV..., ..., Float..., ListV..., ListV...]
iter: <class 'rpy2.robjects.vectors.FloatVector'>
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x11e3fdd08 / R:0x7ff7cced0a28>
[156.000000]
fuzzy: <class 'rpy2.robjects.vectors.Matrix'>
R object with classes: ('matrix',) mapped to:
<Matrix - Python:0x11e3fd8c8 / R:0x118e78000>
[0.000000, 0.917546, 0.004050, ..., 0.077300, 0.076273, 0.091252]
R object with classes: ('teigen',) mapped to:
<ListVector - Python:0x11e3fdc48 / R:0x7ff7d229dcb0>
[Float..., Matrix, ListV..., ..., Float..., ListV..., ListV...]
...
iter: <class 'rpy2.robjects.vectors.FloatVector'>
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x11d632508 / R:0x7ff7cfa81658>
[-25365.912426]
R object with classes: ('teigen',) mapped to:
<ListVector - Python:0x11e3fdc48 / R:0x7ff7d229dcb0>
[Float..., Matrix, ListV..., ..., Float..., ListV..., ListV...]
R object with classes: ('teigen',) mapped to:
<ListVector - Python:0x11e3fdc48 / R:0x7ff7d229dcb0>
[Float..., Matrix, ListV..., ..., Float..., ListV..., ListV...]
All in all, I am looking for the same results as in the original R example in the first code box, except that the df, mean and scale variables should be Python lists or numpy arrays. Not knowing R at all makes using rpy2 quite difficult, and maybe there is a more elegant way to call this routine dynamically and get the results back into the Python world.
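One possible direction, as a sketch rather than a verified solution: the R expression tt$parameters$df corresponds to looking up list elements by name, which rpy2 exposes on its list objects through the rx2 method. The helper name extract_teigen_parameters is hypothetical. It assumes that rres is the object returned by teigen.teigen above and that the extracted elements can be iterated as numbers.

```python
def extract_teigen_parameters(rres):
    """Return df, mean and scale from a teigen result as plain Python lists.

    Hypothetical helper: relies only on the object offering rx2(name),
    the rpy2 counterpart of R's `$` / `[[ ]]` lookup by name.
    """
    params = rres.rx2("parameters")

    def as_floats(r_values):
        # Iterating an R numeric vector, matrix or array yields its values
        # one by one, similar to flattening with c(...) in R.
        return [float(v) for v in r_values]

    return {
        "df": as_floats(params.rx2("df")),
        "mean": as_floats(params.rx2("mean")),
        "scale": as_floats(params.rx2("sigma")),
    }
```

The resulting lists can be turned into numpy arrays with numpy.asarray if arrays are preferred.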