39

I converted a pandas dataframe to R using the code below:

import pandas as pd
import pandas.rpy.common as com
import rpy2.robjects as ro
from rpy2.robjects.packages import importr

rdf = com.convert_to_r_dataframe(df)

How do I convert rdf back to a pandas.DataFrame?

df = f(rdf)
buhtz
Tampa

5 Answers

31

Since rpy2 release 2.4.0, conversion of data frames between rpy2 and pandas is included as an optional module. Once it is activated, no explicit conversion is needed: it happens on the fly.

The documentation contains examples (also available as a Jupyter notebook - link available near the top of the page): https://rpy2.github.io/doc/latest/html/pandas.html#interoperability-with-pandas

Note: The original answer to this question recommended the following.

from rpy2.robjects import pandas2ri
pandas2ri.activate()

To convert explicitly for any reason, use the functions pandas2ri.py2ri() and pandas2ri.ri2py() (they were previously named pandas2ri.pandas2ri() and pandas2ri.ri2pandas()).

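Note: Since rpy2 release 3.3.0, explicit conversion is done as follows:

import pandas as pd
import rpy2.robjects as ro
from rpy2.robjects import pandas2ri

# Register the pandas conversion rules, as recommended above
pandas2ri.activate()

dt = pd.DataFrame()
# To R DataFrame
r_dt = ro.conversion.py2rpy(dt)
# To pandas DataFrame
pd_dt = ro.conversion.rpy2py(r_dt)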

For more details check out this link.
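
The explicit calls above only handle pandas objects while the pandas conversion rules are active. Below is a minimal sketch of a scoped alternative to pandas2ri.activate(). It assumes an rpy2 3.x release in which localconverter and the module-level ro.conversion.py2rpy / ro.conversion.rpy2py are available (newer releases may need the get_conversion() form instead). The two helper names are hypothetical and not part of rpy2.

```python
def pandas_to_r(pd_df):
    """Convert a pandas.DataFrame to an R data.frame (hypothetical helper)."""
    # rpy2 is imported lazily so this module still loads on machines without R.
    import rpy2.robjects as ro
    from rpy2.robjects import pandas2ri
    from rpy2.robjects.conversion import localconverter

    # The pandas rules are active only inside this block, not globally.
    with localconverter(ro.default_converter + pandas2ri.converter):
        return ro.conversion.py2rpy(pd_df)


def r_to_pandas(r_df):
    """Convert an R data.frame back to a pandas.DataFrame (hypothetical helper)."""
    import rpy2.robjects as ro
    from rpy2.robjects import pandas2ri
    from rpy2.robjects.conversion import localconverter

    with localconverter(ro.default_converter + pandas2ri.converter):
        return ro.conversion.rpy2py(r_df)
```

With these helpers, the f asked for in the question would be r_to_pandas, as in df = r_to_pandas(rdf).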

Miguel Trejo
lgautier
12

As suggested by lgautier, it can be done with pandas2ri.

Here is sample code for converting an rpy2 data frame (rdf) to a pandas data frame (pd_df):

from rpy2.robjects import pandas2ri

pd_df = pandas2ri.ri2py_dataframe(rdf)
tbekolay
huojun
  • This operation replaces the index of the R dataframe with integers (if it was something other than integers to begin with). Anyone know how to keep the original index? – A. Slowey Feb 01 '19 at 17:00
9

Given your import, it appears it is:

com.convert_robj(rdf)

For example,

In [480]: dfrm
Out[480]:
           A          B  C
0   0.454459  49.916767  1
1   0.943284  50.878174  1
2   0.974856  50.335679  2
3   0.776600  50.782104  1
4   0.553895  50.084505  1
5   0.514018  50.719019  2
6   0.915413  50.513962  0
7   0.771571  49.859855  2
8   0.068619  49.409657  0
9   0.728141  50.945174  2
10  0.388115  47.879653  1
11  0.960172  49.680258  0
12  0.015216  50.067968  0
13  0.495024  50.286287  1
14  0.565954  49.909771  1
15  0.992279  49.009696  1
16  0.179934  49.554256  0
17  0.521243  47.854791  0
18  0.551241  51.076262  1
19  0.713271  49.418503  0
20  0.801716  50.660304  1

In [481]: rdfrm = com.convert_to_r_dataframe(dfrm)

In [482]: rdfrm
Out[482]:
<DataFrame - Python:0x14905cf8 / R:0x1600ee98>
[FloatVector, FloatVector, IntVector]
  A: <class 'rpy2.robjects.vectors.FloatVector'>
  <FloatVector - Python:0xf9d0b00 / R:0x140e2620>
[0.454459, 0.943284, 0.974856, ..., 0.551241, 0.713271, 0.801716]
  B: <class 'rpy2.robjects.vectors.FloatVector'>
  <FloatVector - Python:0xf9d0878 / R:0x125aa240>
[49.916767, 50.878174, 50.335679, ..., 51.076262, 49.418503, 50.660304]
  C: <class 'rpy2.robjects.vectors.IntVector'>
  <IntVector - Python:0x11fceef0 / R:0x13f0d918>
[       1,        1,        2, ...,        1,        0,        1]

In [483]: com.convert_robj(rdfrm)
Out[483]:
           A          B  C
0   0.454459  49.916767  1
1   0.943284  50.878174  1
2   0.974856  50.335679  2
3   0.776600  50.782104  1
4   0.553895  50.084505  1
5   0.514018  50.719019  2
6   0.915413  50.513962  0
7   0.771571  49.859855  2
8   0.068619  49.409657  0
9   0.728141  50.945174  2
10  0.388115  47.879653  1
11  0.960172  49.680258  0
12  0.015216  50.067968  0
13  0.495024  50.286287  1
14  0.565954  49.909771  1
15  0.992279  49.009696  1
16  0.179934  49.554256  0
17  0.521243  47.854791  0
18  0.551241  51.076262  1
19  0.713271  49.418503  0
20  0.801716  50.660304  1

With docs:

In [475]: com.convert_robj?
Type:       function
String Form:<function convert_robj at 0x13e85848>
File:       /mnt/epd/7.3-2_pandas0.12/lib/python2.7/site-packages/pandas/rpy/common.py
Definition: com.convert_robj(obj, use_pandas=True)
Docstring:
Convert rpy2 object to a pandas-friendly form

Parameters
----------
obj : rpy2 object

Returns
-------
Non-rpy data structure, mix of NumPy and pandas objects
ely
6

Build the pandas data frame directly from the columns of the rpy2 data frame r_df. This avoids the deprecation warning "FutureWarning: from_items is deprecated. Use DataFrame.from_dict(dict(items), ...) instead".

Here type(r_df) is "rpy2.robjects.vectors.DataFrame" and type(pd_df) is "pandas.core.frame.DataFrame". The line below assumes import numpy as np and import pandas as pd.

pd_df = pd.DataFrame.from_dict({ key : np.asarray(r_df.rx2(key)) for key in r_df.names })
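
The one-liner above can be wrapped in a small function with its imports spelled out. This is only a sketch: the function name is hypothetical, and it relies on nothing beyond r_df.names and r_df.rx2(name), exactly as the original line does. Note that the result gets a default integer index, so any row names on the R side are not carried over.

```python
import numpy as np
import pandas as pd


def r_dataframe_to_pandas(r_df):
    """Build a pandas.DataFrame column by column (hypothetical helper).

    r_df is expected to expose `.names` (the column names) and
    `.rx2(name)` (column lookup), as rpy2's DataFrame does.
    """
    # One NumPy array per column, keyed by the column name.
    columns = {name: np.asarray(r_df.rx2(name)) for name in r_df.names}
    return pd.DataFrame.from_dict(columns)
```

Usage would then be pd_df = r_dataframe_to_pandas(r_df).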

Rock Pereira
1

The other solutions seem to be outdated and no longer worked for me.

According to the docs, this is the current way to convert between pandas and R objects.

import rpy2.robjects as ro
from rpy2.robjects import pandas2ri

From pandas to R:

with ro.default_converter + pandas2ri.converter:
  r_from_pd_df = ro.conversion.get_conversion().py2rpy(pd_df)

r_from_pd_df

From R to pandas:

with ro.default_converter + pandas2ri.converter:
  pd_from_r_df = ro.conversion.get_conversion().rpy2py(r_df)

pd_from_r_df

This only works in rpy2 version >=3.5.7.
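
Since the approach depends on the installed rpy2 version, the threshold can be checked at runtime before choosing which form to use. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and the 3.5.7 threshold is simply the one stated above.

```python
import re
from importlib import metadata

MIN_VERSION = (3, 5, 7)  # threshold quoted in this answer


def version_tuple(version):
    """Turn '3.5.10' or '3.6.0.dev0' into a tuple of leading integers."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def supports_get_conversion(version):
    """True if the given rpy2 version string meets the threshold."""
    return version_tuple(version) >= MIN_VERSION


def installed_rpy2_supports_get_conversion():
    """Check the installed rpy2, returning False if it is not installed."""
    try:
        return supports_get_conversion(metadata.version("rpy2"))
    except metadata.PackageNotFoundError:
        return False
```

Comparing integer tuples rather than raw strings matters here, because a string comparison would rank "3.5.10" below "3.5.7".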

shs
Martin Niederl