I'm trying to use Python to define a variable as a string, specifically a path to a file. I then want Python to pass that string to an R variable, and to use R's read.table function to read the contents of that file into an R variable as a table. I'm using rpy2 and r.assign to accomplish this, but I'm getting nowhere. Any help would be appreciated! The error message I receive is pasted below the code.

import os
import sys
from rpy2.robjects import r
import rpy2.robjects as robjects
from rpy2.robjects import *

r = robjects.r

known_genes = str(raw_input('Path to file containing gene coordinates? '))
anno_genes = str(raw_input('Path to gene:ilmn ID mapping file? '))
ms_meta = str(raw_input('Path to GWAS MS Meta Data file? '))
SNP_ID = str(raw_input('SNP Identifier? '))
SNP_dir = str(raw_input('SNP results directory? '))


r.assign('known.genes', known_genes)
r.assign('anno.genes', anno_genes)
r.assign('ms.meta', ms_meta)
r.assign('SNP', SNP_ID)
r.assign('SNP_dir', SNP_dir)

knowngenes = r('read.table("known.genes", header=T, as.is=T)')
annogenes = r('read.table("anno.genes", header=T, as.is=T)')



Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'known.genes': No such file or directory
Traceback (most recent call last):
  File "plot.py", line 24, in <module>
    knowngenes = r('read.table("known.genes", header=T, as.is=T)')
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/rpy2-2.3.8-py2.7-macosx-10.6-intel.egg/rpy2/robjects/__init__.py", line 240, in __call__
    res = self.eval(p)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/rpy2-2.3.8-py2.7-macosx-10.6-intel.egg/rpy2/robjects/functions.py", line 86, in __call__
    return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/rpy2-2.3.8-py2.7-macosx-10.6-intel.egg/rpy2/robjects/functions.py", line 35, in __call__
    res = super(Function, self).__call__(*new_args, **new_kwargs)
rpy2.rinterface.RRuntimeError: Error in file(file, "rt") : cannot open the connection

user2977350
  • Looks like it can't find your `known.genes` file. Where is it located? – aIKid Nov 11 '13 at 00:36
  • @alKid: The known.genes file is located in the path submitted by the user. The path is stored into known_genes using Python. Then using rpy2, I'm trying to pass the information in known_genes (the path to the file) to the R variable known.genes using r.assign. Thanks for taking the time to look at this! – user2977350 Nov 11 '13 at 00:56

2 Answers

knowngenes = r('read.table("known.genes", header=T, as.is=T)')

should simply be

knowngenes = r('read.table(known.genes, header=T, as.is=T)')

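The double quotes are part of the R code, so R treated "known.genes" as a literal file name and looked for a file called known.genes in its working directory. Python does not interpret anything inside the string; it hands the whole expression to R unchanged. Without the quotes, R evaluates known.genes as a variable, which holds the path assigned earlier with r.assign.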
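To make the difference visible without running R, here is a minimal sketch that only builds the two R expressions as Python strings and prints them. The helper name `r_read_table_call` is made up for this illustration and is not part of rpy2; it also spells out `TRUE` instead of `T`, which is a stylistic choice rather than something the fix requires.

```python
def r_read_table_call(r_variable_name):
    """Build R code that reads the file whose path is held in an R variable.

    Hypothetical helper for illustration; it only formats a string.
    """
    # No quotes around the name, so R will look it up as a variable.
    return 'read.table(%s, header=TRUE, as.is=TRUE)' % r_variable_name


# Quoted: R receives a string literal and tries to open a file
# that is literally named "known.genes".
quoted = 'read.table("known.genes", header=TRUE, as.is=TRUE)'

# Unquoted: R evaluates the variable known.genes and opens the path it holds.
unquoted = r_read_table_call('known.genes')

print(quoted)
print(unquoted)
```

Either string could then be handed to `r(...)`; only the unquoted form refers to the value stored by `r.assign`.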

user2977350

The RRuntimeError exception means that an error occurred while R was evaluating the code, and the message here says that R could not open a connection (in this case, a file).

There is probably some confusion between the name of a variable and the content of that variable. When writing

knowngenes = r('read.table("known.genes", header=T, as.is=T)')

it is strictly equivalent to writing this in R:

knowngenes = read.table("known.genes", header=T, as.is=T) 

and the code you have before that indicates that the name of the file is stored in a variable called known.genes.

I'd suggest rewriting the code like this (and minimizing the number of objects you store in the R global environment):

from rpy2.robjects.packages import importr
utils = importr('utils')

mydataframe = utils.read_table(myfilename, header=True, as_is=True)
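
As a rough sketch of how this could be wrapped for a user-supplied path, the function below checks the path in Python before calling R. The name `read_r_table` is hypothetical, and the lazy import is only there so that the path check can run on a machine without rpy2 or R; it assumes that `importr('utils')` exposes `read.table` as `read_table` with `as.is` available as `as_is`, as in the snippet above.

```python
import os


def read_r_table(path):
    """Read a delimited text file into an R data.frame through rpy2.

    Hypothetical helper for illustration; not part of rpy2.
    """
    # Expand "~" and check the path in Python first, so a bad path fails
    # with a Python IOError instead of an RRuntimeError raised from R.
    path = os.path.expanduser(path)
    if not os.path.isfile(path):
        raise IOError('No such file: %s' % path)

    # Imported here (an assumption of this sketch) so the check above
    # still works where rpy2 or R is not installed.
    from rpy2.robjects.packages import importr
    utils = importr('utils')

    # The path is passed as an ordinary argument; nothing is assigned
    # in the R global environment.
    return utils.read_table(path, header=True, as_is=True)
```

With the question's variables, the call would look like `knowngenes = read_r_table(known_genes)`, where `known_genes` is the string returned by `raw_input`.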
lgautier
  • Thank you for explaining the RRuntimeError, @lgautier. I tried using your code and got a `SyntaxError: invalid syntax` on `known.genes = utils.read_table(known_genes, header=True, as.is=True)`. What I attempted to do was create a mechanism through which users could specify any given path to a file. I was hoping that the entire path of the file would be stored in known_genes through the Python code, that the path would be passed as a string to R through r.assign, and that read.table would open the file at that path and read its contents as a table in R. – user2977350 Nov 11 '13 at 01:04
  • @user2977350 Typo. '.' is not a syntactically valid character for symbol names. Fixed in the answer. – lgautier Nov 11 '13 at 02:45
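
The naming issue in the comments above can be sketched in plain Python: names such as `as.is` or `known.genes` are fine in R but cannot be used as Python variable or keyword-argument names, which is why the underscore form is used on the Python side. The two helpers below are hypothetical and only imitate the idea of swapping dots for underscores; they are not rpy2's own implementation, and the identifier check ignores Python reserved words.

```python
import re

# Simple ASCII-only pattern for a Python name (reserved words not excluded).
_PY_IDENTIFIER = re.compile(r'^[A-Za-z_][A-Za-z0-9_]*$')


def is_python_identifier(name):
    """Return True if `name` has the shape of a Python identifier."""
    return _PY_IDENTIFIER.match(name) is not None


def to_python_name(r_name):
    """Replace the dots of an R name with underscores (illustration only)."""
    return r_name.replace('.', '_')


for r_name in ('as.is', 'header', 'read.table'):
    print('%s -> %s' % (r_name, to_python_name(r_name)))
```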