I computed the confidence intervals for each variable in a linear regression. I wanted to use the results for sorting variables, so I stored the result set as a data frame. However, when I called str() on one of its columns I got an error (shown below). How can I store the result set so that I'll be able to work on it?

df <- read.table(text = "target birds    wolfs     
                         1        9         7 
                         1        8         4 
                         0        2         8 
                         1        2         3 
                         0        1         2 
                         1        7         1 
                         0        1         5 
                         1        9         7 
                         1        8         7 
                         0        2         7 
                         0        2         3 
                         1        6         3 
                         0        1         1 
                         0        3         9 
                         0        1         1  ",header = TRUE)
model<-lm(target~birds+wolfs,data=df)
confint(model)
                  2.5 %     97.5 %
(Intercept) -0.23133823 0.36256052
birds        0.10102771 0.18768505
wolfs       -0.09698902 0.00812353
s<-as.data.frame(confint(model))
str(s$2.5%)
Error: unexpected numeric constant in "str(s$2.5"
mql4beginner
  • You need to add backticks, otherwise you are telling R to evaluate an expression. You also *must* add a space between `2.5` and `%` in order to match the exact column name. Try ```str(s$`2.5 %`)```. Other than that, I would advise you to convert your column names to proper column names using `names(s) <- make.names(names(s))` – David Arenburg Aug 16 '15 at 08:37
  • I strongly recommend the "broom" package whenever you want to create a data frame from a model's output. Check this: http://finzi.psych.upenn.edu/library/broom/html/confint_tidy.html . – AntoniosK Aug 16 '15 at 08:55
  • Thanks AntoniosK, I'll have a look at it. – mql4beginner Aug 16 '15 at 09:07
  • I just found another solution: colnames(s)[1:2] <- c('lwr','upr') – mql4beginner Aug 16 '15 at 09:08
  • btw, if you use `data.frame` instead of `as.data.frame`, `make.names` will be applied automatically. Try `s <- data.frame(confint(model))` – David Arenburg Aug 16 '15 at 11:34
  • **Not a duplicate**. Reopen this, please. I also disagree with the `make.names` advice, but that’s incidental. @DavidArenburg If you recommend `make.names`, do so in an answer, please don’t *mandate* this as the only true answer by linking to an inappropriate duplicate and closing the question. – Konrad Rudolph Aug 16 '15 at 11:37
  • @KonradRudolph why is using `make.names` inappropriate? There is also an example in the answers on how to use backticks in order to call an invalid name in R. When I suggested this to the OP he seemed to be satisfied. I didn't mandate anything. Every dupe could have many additional answers, so are you suggesting we shouldn't close anything as a dupe? – David Arenburg Aug 16 '15 at 11:42
  • @DavidArenburg It’s not inappropriate, it’s just not the only, nor the best, solution. The other question is completely different. The answers happen to be applicable as well but for somebody reading the questions there really isn’t a lot of similarity. Closing as duplicate is for *duplicates*, not vaguely related questions. The answer I want to write here wouldn’t make sense for the other question, hence I can’t post it there. – Konrad Rudolph Aug 16 '15 at 11:44
  • @KonradRudolph The question here is not *vaguely related* but rather almost the same, just with a different title. As you can see, the OP's solution to his own question was to use backticks. I will reopen, but I don't see the point of answering a question that has already been asked ~1MM times on SO. – David Arenburg Aug 16 '15 at 11:47
  • @David I think there’s a huge difference between a question whose answer is “Use `check.names = FALSE` (+ some details)”, and one where it’s “use backticks (+ some details)”, never mind that the details overlap. I agree that there are probably appropriate duplicates, but this is not it. – Konrad Rudolph Aug 16 '15 at 11:49

1 Answer

The expression after the $ operator must be a valid R identifier. 2.5% isn’t a valid R identifier, but there’s a simple way of making it one: put it into backticks: `2.5%`¹. In addition, the name you write must match the column name exactly (or at least be a prefix of it). Here the column is called "2.5 %", so you need to add a space before the %:

str(s$`2.5 %`)

In general, a$b is the same as a[['b']] (with some subtleties; refer to the documentation). So you can also write:

str(s[['2.5 %']])
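
As a quick check that the two spellings above really return the same column, here is a minimal sketch that rebuilds the question's data with `data.frame()` (instead of `read.table(text = ...)`) and compares both forms. The variable names `lower_a` and `lower_b` are only illustrative.

```r
# Same 15 observations as in the question, typed in column by column.
df <- data.frame(
  target = c(1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0),
  birds  = c(9, 8, 2, 2, 1, 7, 1, 9, 8, 2, 2, 6, 1, 3, 1),
  wolfs  = c(7, 4, 8, 3, 2, 1, 5, 7, 7, 7, 3, 3, 1, 9, 1)
)

model <- lm(target ~ birds + wolfs, data = df)
s <- as.data.frame(confint(model))

# Print the exact column names, including the space before "%".
print(names(s))

# Backticks turn the non-syntactic name into something `$` can parse.
lower_a <- s$`2.5 %`

# `[[` takes the name as an ordinary string, so no backticks are needed.
lower_b <- s[["2.5 %"]]

# Both forms extract the same numeric vector.
print(identical(lower_a, lower_b))
str(lower_a)
```

Printing `names(s)` first is a cheap way to see the exact spelling before typing it between backticks.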

Alternatively, you could assign new column names to the data.frame that are valid identifiers. Beware of make.names though: it makes your strings into valid R names, but at the cost of mangling them in ways that are not always obvious. Relying on it risks confusing readers of the code, because previously undeclared identifiers suddenly appear in the code. In the same vein, you should always specify check.names = FALSE with data.frame, otherwise R once again mangles your column names.
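
The sketch below illustrates the three options mentioned in this paragraph. To stay self-contained it does not refit the model; the matrix `ci` is typed in from the `confint(model)` output shown in the question. The names `mangled`, `kept`, `renamed` and `sorted` are illustrative, and `lwr`/`upr` are the names the asker chose in the comments.

```r
# Confidence limits copied from the confint(model) output in the question.
ci <- matrix(
  c(-0.23133823,  0.10102771, -0.09698902,   # 2.5 % column
     0.36256052,  0.18768505,  0.00812353),  # 97.5 % column
  ncol = 2,
  dimnames = list(c("(Intercept)", "birds", "wolfs"), c("2.5 %", "97.5 %"))
)

# Default data.frame(): check.names = TRUE runs the names through make.names().
mangled <- data.frame(ci)
print(names(mangled))

# check.names = FALSE keeps the original names untouched.
kept <- data.frame(ci, check.names = FALSE)
print(names(kept))

# Explicit renaming: the new names are visible right in the code.
renamed <- as.data.frame(ci)
names(renamed) <- c("lwr", "upr")

# With syntactic names, sorting the variables by lower bound reads naturally.
sorted <- renamed[order(renamed$lwr, decreasing = TRUE), ]
print(sorted)
```

Whether explicit renaming or backticks is preferable is a matter of taste; the point of the sketch is only that explicit names make the change visible to readers of the code.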


¹ In fact, R also accepts single quotes here (s$'2.5 %'). However, I suggest you forget this immediately; it’s a historical accident of the R language, and treating identifiers and strings the same (especially since it’s done inconsistently) does more harm than good.

Konrad Rudolph
  • or `str(s[["2.5 %"]])` – Ben Bolker Aug 16 '15 at 11:58
  • I don't agree that "*you should always specify check.names = FALSE with data.frame*". I think you are *mandating* here some practices that are mainly opinion-based. There is a reason why this is the default behaviour of `data.frame` and of the function `make.names` itself. – David Arenburg Aug 16 '15 at 12:05
  • @DavidArenburg I maintain that these are *bad* reasons (unlike `stringsAsFactors=TRUE`, which has good historical reasons, although nowadays everybody *still* recommends the opposite). And yes, it’s an opinion. But it’s a technically qualified opinion, from extensive practical experience, not a subjective one à la “I like the colour blue”. – Konrad Rudolph Aug 16 '15 at 12:08
  • Not to mention that you practically converted my comment to your answer, with the exception of adding your opinion against `make.names` while wording it as a fact rather than an opinion. But I guess this is the common practice on SO these days. – David Arenburg Aug 16 '15 at 12:12
  • Guys, thank you both for the answers. I (and I'm sure many others) learned a lot from your debate. – mql4beginner Aug 16 '15 at 12:15
  • @David Yes, I would have preferred if you had provided the comment as an answer. But lacking that you can’t really complain that I posted the same technical solution in an answer. As for the rest: I strive to teach best practices. This implies discouraging bad practices. This is what teaching is about. I object to your claim that I’m confounding facts and opinions. We simply disagree about a best practice. I acknowledge that, but I won’t cheapen my opinion on the matter by qualifying it unduly. You can add your own answer, after all. – Konrad Rudolph Aug 16 '15 at 12:17