
I'm trying to import an RData object, a named list, with rpy2. Most of the elements in that list convert fine, but one element gives me an error.

When I try to make a pandas DataFrame from an rpy2.robjects.vectors.DataFrame using:

SecondObject = rpy2.robjects.pandas2ri.ri2py_dataframe(r[Name][i][j][k])

I get this error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 34: invalid start byte

The error is caused by the text in row 131, column 2, which contains:

'Long forward passes (span angle 90\xb0)'

type(r[Name][i][j]) gives:
rpy2.robjects.vectors.DataFrame

The second column of that particular dataframe looks like:

r[Name][i][j][1]
Out[255]: 
R object with classes: ('character',) mapped to:
<StrVector - Python:0x13220e888 / R:0x7fa430ea3600>
['Air chal..., 'Ground c..., 'Ground c..., 'Air chal..., ..., 'Challeng..., 'Air chal..., 'Dribbles..., 'Tackles ...]

r[Name][i][j][1][129] succeeds, but r[Name][i][j][1][130] raises: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 34: invalid start byte

Nothing I tried with decoding and encoding worked.

When I type the string directly into Python, it does understand the symbol from R:

b = "Long forward passes (span angle 90\xb0)"

b
Out[258]: 'Long forward passes (span angle 90°)'
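The difference between the literal above and the failing value can be reproduced without rpy2. This is only a minimal sketch, assuming the R string is stored as Latin-1 bytes (the variable names are made up for illustration): in a Python str literal, `\xb0` is the code point U+00B0, whereas the raw byte 0xb0 on its own is not valid UTF-8.

```python
# Assumption: these are the raw bytes stored in the R character vector.
raw = b"Long forward passes (span angle 90\xb0)"

# Decoding as UTF-8 fails at the lone 0xb0 byte.
error_position = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    error_position = exc.start  # index of the offending byte

# Decoding as Latin-1 maps byte 0xb0 to the degree sign.
text = raw.decode("latin-1")

# In UTF-8 the degree sign is the two-byte sequence 0xc2 0xb0.
utf8_bytes = text.encode("utf-8")

print(error_position)
print(text)
print(utf8_bytes)
```

The failing index matches the "position 34" reported in the traceback, which suggests the bytes are Latin-1 rather than UTF-8.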

Can somebody help me with what to do?

trotta
JosV
  • `b0` is the degree symbol (`°`) in Latin-1. So when you are reading the file, you should tell Python `encoding='latin-1'`. – Giacomo Catenazzi Feb 20 '19 at 13:26
  • Reading the file is done by: rf=r['load'](file2), where file2 = '/Path/To/File/FileName.Rdata'. How could I use encoding='latin-1' in there? Because I do understand your suggestion. – JosV Feb 20 '19 at 13:39
  • Hmm, with rpy2 it seems difficult (for such a common option). See `https://stackoverflow.com/questions/34024654/reading-rdata-file-with-different-encoding` – Giacomo Catenazzi Feb 20 '19 at 13:47
  • Thanks for this option. When I use Encoding(df[,2]) = "latin1", as stated on that site, my value changes to "Long forward passes (span angle 90°)", so that works. But I have to work with a lot of files, so I want to tackle this problem inside Python rather than first fixing all the files in RStudio and exporting them again. So I'm searching for a solution to this encoding problem inside Python. – JosV Feb 20 '19 at 14:09
  • Try the dev branch of rpy2 (future rpy2-3.0.0) to be released soon. There were fixes around string encoding. – lgautier Feb 23 '19 at 20:49
  • rpy2's latest release is 3.0.1. It might be worth checking whether the problem is still present. – lgautier Mar 02 '19 at 02:14
  • I updated my rpy2 version from 2.9.4 to 3.0.1, but the same error still occurs. @lgautier – JosV Mar 04 '19 at 10:16
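One way the Encoding idea from the comments might be driven from Python is to let R re-encode the character columns before the values are pulled across. This is an untested sketch, not a confirmed fix: it assumes the strings are Latin-1, the helper name `fix_encoding` and the constant `R_FIX_ENCODING` are made up for illustration, and it uses R's `iconv` rather than `Encoding<-`.

```python
# R source for a helper that converts every character column of a data frame.
# Assumption: the strings were saved as Latin-1 (as the 0xb0 byte suggests).
R_FIX_ENCODING = """
function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], iconv, from = "latin1", to = "UTF-8")
  df
}
"""


def fix_encoding(r_dataframe):
    """Return an R data.frame whose character columns are converted to UTF-8."""
    # Imported lazily so this module can be loaded without rpy2 installed.
    import rpy2.robjects as robjects

    # Evaluating the source yields an R function that is callable from Python.
    fix = robjects.r(R_FIX_ENCODING)
    return fix(r_dataframe)
```

Hypothetical usage would be `fixed = fix_encoding(r[Name][i][j])`, followed by the pandas conversion on `fixed`. Because the conversion runs inside R, it could be applied in a loop over many files without re-exporting them.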

0 Answers