
I'm trying to convert an R data frame to a pandas DataFrame. I use the following code:

from rpy2.robjects import pandas2ri
pandas2ri.activate()
r_dataframe = r_function(my_dataframe['Numbers'])
print(r_dataframe)
python_dataframe = pandas2ri.ri2py(r_dataframe)

The above code works well in Jupyter Notebook (Anaconda). But if I put the same code in a file, my_program.py, and run it from the terminal, I get an error:

:~$ python3 my_program.py
Traceback (most recent call last):
  File "my_program.py", line 223, in <module>
    python_dataframe = pandas2ri.ri2py(r_dataframe)
AttributeError: module 'rpy2.robjects.pandas2ri' has no attribute 'ri2py'

The line print(r_dataframe) shows the correct result in the terminal.

If I run print(dir(pandas2ri)) in Jupyter Notebook, the listing includes 'ri2py':

['DataFrame', 'FactorVector', 'FloatSexpVector', 'INTSXP', 'ISOdatetime', 'IntSexpVector', 'IntVector', 'ListSexpVector', 'ListVector', 'OrderedDict', 'POSIXct', 'PandasDataFrame', 'PandasIndex', 'PandasSeries', 'SexpVector', 'StrSexpVector', 'StrVector', 'Vector', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'activate', 'as_vector', 'conversion', 'converter', 'datetime', 'deactivate', 'dt_O_type', 'dt_datetime64ns_type', 'get_timezone', 'numpy', 'numpy2ri', 'original_converter', 'os', 'pandas', 'py2ri', 'py2ri_categoryseries', 'py2ri_pandasdataframe', 'py2ri_pandasindex', 'py2ri_pandasseries', 'py2ro', 'pytz', 'recarray', 'ri2py', 'ri2py_dataframe', 'ri2py_floatvector', 'ri2py_intvector', 'ri2py_listvector', 'ri2py_vector', 'ri2ro', 'rinterface', 'ro', 'warnings']

And if I run the same print(dir(pandas2ri)) in the terminal, the listing includes 'rpy2py' instead:

['DataFrame', 'FactorVector', 'FloatSexpVector', 'ISOdatetime', 'IntSexpVector', 'IntVector', 'ListSexpVector', 'OrderedDict', 'POSIXct', 'PandasDataFrame', 'PandasIndex', 'PandasSeries', 'Sexp', 'SexpVector', 'StrSexpVector', 'StrVector', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'activate', 'as_vector', 'conversion', 'converter', 'datetime', 'deactivate', 'default_timezone', 'dt_O_type', 'get_timezone', 'is_datetime64_any_dtype', 'numpy', 'numpy2ri', 'original_converter', 'pandas', 'py2rpy', 'py2rpy_categoryseries', 'py2rpy_pandasdataframe', 'py2rpy_pandasindex', 'py2rpy_pandasseries', 'pytz', 'ri2py_vector', 'rinterface', 'rpy2py', 'rpy2py_dataframe', 'rpy2py_floatvector', 'rpy2py_intvector', 'rpy2py_listvector', 'tzlocal', 'warnings']

It turns out the developers have renamed the functions: the terminal environment exposes rpy2py and py2rpy where Jupyter exposes ri2py and py2ri.
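One way to keep a script working under both naming schemes is to look the function up by name at run time. The following is only a sketch: the helper name pick_r_to_pandas is hypothetical, and the two SimpleNamespace objects are stand-ins for the two pandas2ri layouts listed above, so the snippet runs without rpy2 installed. Whether the renamed function behaves identically to the old one when called directly is an assumption I have not verified.

```python
from types import SimpleNamespace


def pick_r_to_pandas(module):
    """Return the R-to-pandas conversion function exposed by `module`.

    Tries the newer name ('rpy2py') first, then the older one ('ri2py').
    """
    for name in ("rpy2py", "ri2py"):
        func = getattr(module, name, None)
        if callable(func):
            return func
    raise AttributeError("neither 'rpy2py' nor 'ri2py' found on module")


# Stand-ins (not real rpy2 modules) mimicking the two dir() listings.
old_style = SimpleNamespace(ri2py=lambda obj: ("ri2py", obj))
new_style = SimpleNamespace(rpy2py=lambda obj: ("rpy2py", obj))

print(pick_r_to_pandas(old_style)("df"))
print(pick_r_to_pandas(new_style)("df"))
```

In a real script the call would presumably look like pick_r_to_pandas(pandas2ri)(r_dataframe), with pandas2ri imported as in the code at the top of the question.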

Denis
Check the Python version in Jupyter and then on the command line. The two likely differ, along with their corresponding `rpy2` modules. – Parfait May 05 '19 at 18:41
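The check suggested in this comment could be sketched as follows. Run the same snippet once in a Jupyter cell and once with python3 from the terminal, then compare the output. The function name describe_environment is hypothetical; it uses only the standard library (importlib.metadata, available from Python 3.8), and it reports None if the package is not installed in that interpreter.

```python
import sys
from importlib import metadata


def describe_environment(package="rpy2"):
    """Report which interpreter is running and which version of `package` it sees."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed for this interpreter.
        version = None
    return {
        "executable": sys.executable,      # path of the running Python binary
        "python": sys.version.split()[0],  # interpreter version string
        "package": package,
        "version": version,
    }


print(describe_environment())
```

If the two runs report different executables or different rpy2 versions, that would explain why one environment has ri2py and the other has rpy2py.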

3 Answers


Since no one has written down how to do this with newer versions of rpy2:

Conversion is done inside a localconverter block, which automatically converts a pandas DataFrame to an R data frame and back.

import pandas as pd
import rpy2.robjects as ro
from rpy2.robjects.packages import importr
from rpy2.robjects import pandas2ri

from rpy2.robjects.conversion import localconverter


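# A small pandas DataFrame to pass to R
pd_df = pd.DataFrame({'int_values': [1, 2, 3],
                      'str_values': ['abc', 'def', 'ghi']})

base = importr('base')
# Inside this block, pandas objects passed to R functions are converted automatically
with localconverter(ro.default_converter + pandas2ri.converter):
    df_summary = base.summary(pd_df)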
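For the direction asked about in the question (R data frame to pandas), a minimal sketch follows. It assumes the rpy2 3.0.x API, where ro.conversion.rpy2py is called inside the same kind of localconverter block; later rpy2 releases may expose the conversion differently. The wrapper name r_to_pandas is hypothetical, and the imports are placed inside the function only so that the definition itself does not require rpy2 at import time.

```python
def r_to_pandas(r_dataframe):
    """Convert an R data.frame held by rpy2 into a pandas DataFrame.

    Sketch under the assumption of rpy2 3.0.x; not checked against other versions.
    """
    # Lazy imports: defining this function does not require rpy2 or R.
    import rpy2.robjects as ro
    from rpy2.robjects import pandas2ri
    from rpy2.robjects.conversion import localconverter

    # Combine the default rules with the pandas-specific ones for this block only.
    with localconverter(ro.default_converter + pandas2ri.converter):
        return ro.conversion.rpy2py(r_dataframe)
```

With the names from the question, the failing line would then become python_dataframe = r_to_pandas(r_dataframe).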
JonasV

You are likely using documentation/code written for a different version of rpy2 than what you have installed.

If using the latest release, consider checking the documentation for it:

https://rpy2.github.io/doc/v3.0.x/html/generated_rst/pandas.html

lgautier

For anyone having issues with localconverter, here is an alternative way to convert a df of type rpy2.robjects.vectors.ListVector to a pandas DataFrame. Note that this solution drops the column names.

# nrows and ncols are the dimensions of df; pd is pandas and np is numpy
pd.DataFrame(np.array(df).reshape((nrows, ncols)))
iamthem