When did the problem arise?

I am trying to use the polr() function from R's MASS package (via rpy2) for ordinal regression, using the statement below. "Number of Steps" is my dependent variable. If I replace the spaces with underscores (i.e. Number_of_Steps), both in the column header and in the statement, everything works fine.

model = mass.polr('as.factor(Number of Steps) ~ Var2',
                  data=df_data, method='logistic',
                  Hess = True)
# Here, mass = importr('MASS')

However, with spaces (i.e. Number of Steps), I get the following error.

RRuntimeError: Error in parse(text = x, keep.source = FALSE) :
  <text>:1:17: unexpected symbol
1: as.factor(Number of

How did I try to solve the problem?

I searched on Google and checked several questions on SO related to this problem (e.g. this one), but I still have not found a solution.

Then, my question

How can I use a variable name that contains spaces (i.e. a column header) inside as.factor(variable name) when calling mass.polr()?

Thanks for reading!

lgautier
Md. Sabbir Ahmed

1 Answer

This is not specific to rpy2. In R, backticks ( ` ) delimit a name that contains spaces, so the parser reads it as a single symbol.

Assuming your example is otherwise correct, the following should do it:

model = mass.polr('as.factor(`Number of Steps`) ~ Var2',
                  data=df_data, method='logistic',
                  Hess=True)

Demonstration:

import rpy2.robjects as ro
from rpy2.robjects.packages import importr

# Get an R data frame with a column name that has
# a space.
dataf = ro.r("""
require("MASS")
cbind(housing, "My Sat"=housing$Sat)
""")

print('column names:')
print(tuple(dataf.colnames))

# Fit the model, quoting the column name with backticks.
mass = importr('MASS')
house_plr = mass.polr(
    ro.Formula('as.factor(`My Sat`) ~ Infl + Type + Cont'),
    data=dataf
)
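
When the column names are only known at run time, the backtick quoting can be wrapped in a small helper. The following is a minimal sketch, not part of rpy2 or MASS: the names `r_quote` and `ordinal_formula` are hypothetical, and the helper simply refuses names containing a backtick or backslash rather than trying to escape them.

```python
def r_quote(name):
    """Wrap a column name in backticks so R parses it as one symbol.

    Hypothetical helper: names containing a backtick or a backslash
    are rejected instead of escaped, to keep the sketch simple.
    """
    if '`' in name or '\\' in name:
        raise ValueError('cannot quote name: {!r}'.format(name))
    return '`' + name + '`'


def ordinal_formula(response, predictors):
    """Build 'as.factor(`response`) ~ `p1` + `p2` + ...' as a string."""
    rhs = ' + '.join(r_quote(p) for p in predictors)
    return 'as.factor({}) ~ {}'.format(r_quote(response), rhs)


print(ordinal_formula('Number of Steps', ['Var2']))
# prints: as.factor(`Number of Steps`) ~ `Var2`
```

The resulting string can be passed to `mass.polr()` or `ro.Formula()` in the same way as the hand-written formulas above. Quoting every name, including ones that need no quoting such as Var2, is harmless in R.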
lgautier
  • Thanks for your effort. However, in `rpy2` I am still getting an error: `RRuntimeError: Error in is.factor(x) : object 'Number of Steps' not found` – Md. Sabbir Ahmed Dec 28 '20 at 05:01
  • That's a different error from the one the question is about. It tells you that there is no column with that name in your data frame. – lgautier Dec 28 '20 at 15:01
  • Thanks for your response. I know that I should get this error only when the column is missing. However, even though the column exists (checked with `df['Number of Steps']`), I get this error after wrapping Number of Steps in grave accents (i.e. `as.factor(Grave Accent symbol Number of Steps Grave Accent Symbol )`). – Md. Sabbir Ahmed Dec 28 '20 at 15:07
  • You are not providing a complete example, but I suspect that your data frame is not what you hope it is. I added a self-contained example to show that this is working. – lgautier Dec 29 '20 at 00:02
  • Thank you so much for your effort, but I am sorry to say that it does not work in my case. I tried converting the pandas data frame to an R data frame with `pandas2ri.py2ri(df_data)`, but I still get the same error. – Md. Sabbir Ahmed Dec 29 '20 at 03:10
  • It is much easier to identify the source of an issue with a self-contained example. If the example I provide in my answer runs on your system, then I really think that your `df_data` is not what you think it is. – lgautier Dec 29 '20 at 15:19
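
One way to check whether the data frame that reaches R is what is expected is to compare the column names R sees with the name used in the formula. The thread does not establish the actual cause, so the following is only a diagnostic sketch under an assumption: that the names may have been altered during conversion (R's `data.frame()` with its default `check.names = TRUE`, for instance, rewrites spaces as dots). The function `find_similar_columns` and the sample names are hypothetical; in practice the tuple would come from `tuple(r_dataframe.colnames)`, as in the answer's demonstration.

```python
import re


def find_similar_columns(wanted, colnames):
    """Return the names in `colnames` that equal `wanted` once every
    character other than ASCII letters and digits is ignored."""
    def key(name):
        return re.sub(r'[^0-9A-Za-z]', '', name)

    target = key(wanted)
    return [name for name in colnames if key(name) == target]


# Hypothetical column names, standing in for the output of
# tuple(r_dataframe.colnames) after a conversion from pandas.
r_colnames = ('Number.of.Steps', 'Var2')
wanted = 'Number of Steps'

if wanted not in r_colnames:
    print('Exact name missing; close matches:',
          find_similar_columns(wanted, r_colnames))
```

If a close match such as Number.of.Steps shows up, the formula would need to use that name instead (or the conversion would need to preserve the original names).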