I am using R's MatchIt package, but calling it from Python via the rpy2 package.
On the R side, MatchIt gives me a complex result object that includes the raw data and some additional statistical information. One part of it is a matrix that I want to turn into a data frame, which I can do in R like this:
# R Code
m.out <- matchit(....)
m.sum <- summary(m.out)
# The following two lines should be somehow "translated" into
# Pythons rpy2
balance <- m.sum$sum.matched
balance <- as.data.frame(balance)
My problem is that I don't know how to implement the last two lines with Python's rpy2 package. I am able to get m.out and m.sum with rpy2.
Please see this MWE:
#!/usr/bin/env python3
import rpy2
from rpy2.robjects.packages import importr
import rpy2.robjects as robjects
import rpy2.robjects.pandas2ri as pandas2ri
import pydataset

if __name__ == '__main__':
    # import
    robjects.packages.importr('MatchIt')

    # data
    p_df = pydataset.data('respiratory')
    p_df.treat = p_df.treat.replace({'P': 0, 'A': 1})

    # Convert pandas data into R data
    with robjects.conversion.localconverter(
            robjects.default_converter + pandas2ri.converter):
        r_df = robjects.conversion.py2rpy(p_df)

    # Call R's matchit with R data object
    match_out = robjects.r['matchit'](
        formula=robjects.Formula('treat ~ age + sex'),
        data=r_df,
        method='nearest',
        distance='glm')

    # matched data
    match_data = robjects.r['match.data'](match_out)

    # Convert R data into pandas data
    with robjects.conversion.localconverter(
            robjects.default_converter + pandas2ri.converter):
        match_data = robjects.conversion.rpy2py(match_data)

    # summary object
    match_sum = robjects.r['summary'](match_out)

    # x = robjects.r('''
    #     balance <- match_sum$sum.matched
    #     balance <- as.data.frame(balance)
    #
    #     balance
    #     ''')
When inspecting the Python object match_sum, I can't find anything like sum.matched in it. So I have to somehow "translate" match_sum$sum.matched with rpy2, but I don't know how.
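For what it is worth, R's `$` on a list corresponds to `rx2()` in rpy2, so one possible translation of those two R lines might look like the sketch below. The helper name `summary_component_to_pandas` is made up for illustration, and the sketch assumes that `match_sum` arrives in Python as an rpy2 list vector with a component named `sum.matched` (its components can be listed with `list(match_sum.names)`).

```python
def summary_component_to_pandas(r_list, name):
    """Return r_list$name as a pandas DataFrame (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require R.
    import rpy2.robjects as robjects
    from rpy2.robjects import pandas2ri
    from rpy2.robjects.conversion import localconverter

    # rx2() is rpy2's counterpart of R's `[[` / `$` on a list.
    component = r_list.rx2(name)

    # Calling as.data.frame() inside the converter context hands back
    # a pandas DataFrame instead of an R data.frame.
    with localconverter(robjects.default_converter + pandas2ri.converter):
        return robjects.r['as.data.frame'](component)
```

With the MWE's objects this would be called as `balance = summary_component_to_pandas(match_sum, 'sum.matched')`.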
An alternative solution would be to run everything as R code with robjects.r(''' # r code ...'''). But in that case I don't know how to bring a pandas data frame into that code.
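One way this alternative might work is to assign the pandas data frame to a variable in R's global environment and then refer to it by name inside the R code string. The sketch below is only an illustration under that assumption: `run_r_with_dataframe` and its `r_name` parameter are hypothetical names, not part of rpy2.

```python
def run_r_with_dataframe(p_df, r_code, r_name='df'):
    """Expose p_df to R under r_name, run r_code, return its last value."""
    # Imported lazily so that defining the helper does not require R.
    import rpy2.robjects as robjects
    from rpy2.robjects import pandas2ri
    from rpy2.robjects.conversion import localconverter

    with localconverter(robjects.default_converter + pandas2ri.converter):
        # The assignment converts the pandas DataFrame to an R data.frame
        # and binds it in R's global environment.
        robjects.globalenv[r_name] = p_df
        # The value of the last R expression is converted back, so an
        # R data.frame comes out as a pandas DataFrame.
        return robjects.r(r_code)
```

The R lines from above could then be passed as the code string, for example `run_r_with_dataframe(p_df, 'm.out <- matchit(treat ~ age + sex, data = df); as.data.frame(summary(m.out)$sum.matched)')`, provided MatchIt has been loaded in the R session first. Note that the variable stays in R's global environment after the call.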
EDIT: Be aware that the MWE presented here uses an outdated way of converting R objects into Python objects and vice versa. Please see the answer below for a better one.