I have a dataframe named "exposure" in long format. It has a CRP protein variable named "CRP" and 100 metabolites named 1_T1, 2_T1, 3_T1, 4_T1, etc.

When I tried to do a simple linear regression with lm:

lm(1_T1 ~ CRP, data = exposure)

R returned: Error: unexpected input in "lm(1_"

When I renamed the metabolites in the dataframe so that they start with a letter, such as m1_T1, m2_T1, etc., it worked.

My question is: is there a way to refer to variables whose names start with a number so that they are handled correctly in an R model formula, without having to rename them?

I tried quoting the name as "1_T1", but that didn't work.

Thanks!

Jake

  • You could try backticks, but R has rules about what are syntactically valid names (all languages do). Probably best to just work within that. – joran Oct 11 '16 at 17:19
  • Thanks Joran! From your hints, I found this page (http://stackoverflow.com/questions/36220823/what-do-backticks-do-in-r) that helps explain the requirements for identifiers in R. Starting variable names with a digit is just a bad habit. I will rename them instead. – Jake Oct 11 '16 at 17:58
  • If `i` and `j` are the column names or column numbers of the response and the independent variable in `exposure`, then `lm(exposure[c(i, j)])` will work. No formula is needed in that case. – G. Grothendieck Oct 11 '16 at 19:17
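
For anyone landing here, the two suggestions from the comments can be sketched roughly as follows. This is only an illustration: the toy data frame stands in for the real `exposure` data, and its values, the second metabolite column, and the object names `fit_backtick`, `fit_df`, and `fits` are made up for the example.

```r
# Toy stand-in for the real "exposure" data frame (simulated values, assumption).
set.seed(1)
exposure <- data.frame(CRP = rnorm(20))
exposure[["1_T1"]] <- 0.5 * exposure$CRP + rnorm(20)
exposure[["2_T1"]] <- rnorm(20)

# Option 1 (joran): wrap the non-syntactic name in backticks inside the formula.
fit_backtick <- lm(`1_T1` ~ CRP, data = exposure)

# Option 2 (G. Grothendieck): pass a two-column data frame instead of a formula.
# The first column is taken as the response, the second as the predictor.
fit_df <- lm(exposure[c("1_T1", "CRP")])

# Looping over all metabolites: build each formula from a string,
# adding the backticks around the column name programmatically.
metabolites <- setdiff(names(exposure), "CRP")
fits <- lapply(metabolites, function(m) {
  lm(as.formula(paste0("`", m, "` ~ CRP")), data = exposure)
})
names(fits) <- metabolites
```

Both options should fit the same model here. Renaming the columns, as decided above, is still likely the less error-prone route when the names are used in many places.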
