df <- structure(list(`a a` = 1:3, `a b` = 2:4), .Names = c("a a", "a b"
), row.names = c(NA, -3L), class = "data.frame")

and the data looks like

  a a a b
1   1   2
2   2   3
3   3   4

The following call to select

select(df, 'a a')

gives

Error in abs(ind[ind < 0]) : 
  non-numeric argument to mathematical function

How can I select "a a" and/or rename it to something without a space using select? I know the following approaches:

  1. names(df)[1] <- "a"
  2. select(df, a=1)
  3. select(df, ends_with("a"))

but if I am working on a large data set, how can I get an exact match without knowing the index number or relying on similar column names?

Gregor Thomas
Flux

3 Answers


You can select the variable by wrapping its name in backticks (`):

select(df, `a a`)
#   a a
# 1   1
# 2   2
# 3   3

However, if your main objective is to rename the column, you can use rename from the plyr package, which accepts the old name in either double quotes ("") or backticks (``).

rename(df, replace = c("a a" = "a"))
rename(df, replace = c(`a a` = "a"))

Or in base R:

names(df)[names(df) == "a a"] <- "a"

For a more thorough description of the use of various quotes, see ?Quotes. The 'Names and Identifiers' section is especially relevant here:

"other [syntactically invalid] names can be used provided they are quoted. The preferred quote is the backtick".

See also ?make.names about valid names.
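
As a minimal sketch of what ?make.names describes, the base R snippet below rebuilds the example data frame and converts its names to syntactically valid ones. The object name df_clean is only illustrative and not part of the original answer.

```r
# Rebuild the example data frame; check.names = FALSE keeps the spaces
df <- data.frame(`a a` = 1:3, `a b` = 2:4, check.names = FALSE)

# make.names() replaces invalid characters (here the space) with "."
df_clean <- df
names(df_clean) <- make.names(names(df_clean))

names(df_clean)
# [1] "a.a" "a.b"
```

After this, the columns can be referred to without any quoting, for example df_clean$a.a.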

See also this post about renaming in dplyr

micstr
Henrik
  • you can also do the same with `select`: `select(df, a=\`a a\`)` – Arun Apr 03 '14 at 16:03
  • @Arun, Thanks for your suggestion. But doesn't this both rename "a a", _and_ select this variable only (in contrast to `rename`)? – Henrik Apr 03 '14 at 16:10
  • Henrik, you're right. But `rename` will copy the entire data.frame just to rename the columns. So, I'd not use it / consider it efficient. I'm not sure if there's a way like `setattr` in `data.table`. Ex: `setattr(df, 'names', c("a", "b"))` renames here by reference. – Arun Apr 03 '14 at 16:13

Some alternatives to backticks, good as of dplyr 0.5.0, the current version as of this writing.

If you want to select a column programmatically, without renaming it or pasting the column name into backticks with paste/sprintf, you can use as.name together with the standard-evaluation version of select, which is select_:

dplyr::select_(df, as.name("a a"))

Many dplyr functions have standard-evaluation versions, named with a trailing underscore. In the case of select specifically, you can also use the regular (non-standard evaluation) version together with the select helper one_of. See ?dplyr::select_helpers for documentation:

dplyr::select(df, dplyr::one_of("a a"))
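
For readers on later dplyr releases, where select_() is deprecated, the same programmatic selection can be sketched with the all_of() helper. This assumes dplyr 1.0.0 or later; the variable names col and picked are only illustrative.

```r
library(dplyr)

# Rebuild the example data frame; check.names = FALSE keeps the spaces
df <- data.frame(`a a` = 1:3, `a b` = 2:4, check.names = FALSE)

# Column name held in a variable, e.g. read from a config file
col <- "a a"

# all_of() matches the name exactly and errors if the column is missing
picked <- select(df, all_of(col))
names(picked)
```

Because the name is passed as an ordinary character string, no backticks or as.name() call are needed.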
Andy
  • This is incorrect. Even dplyr's NSE version doesn't handle it. For example: `colnames(mtcars)[1] <- "Miles Per Gallon"` `mtcars %>% select_("Miles Per Gallon")` This will return an error. – krthkskmr Jun 08 '17 at 04:12
  • `mtcars %>% select_(as.name("Miles Per Gallon"))` works. – Andy Jun 09 '17 at 18:45

As of 2023, the code which previously gave the error now runs:

> select(df, 'a a')
  a a
1   1
2   2
3   3

So a legitimate answer to the question 'How to deal with nonstandard column names?' is now: write them as strings. For select this works out of the box, but for mutate you need to use something like mutate(df, a = .data[['a a']]).
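
A short sketch of the string-based approach, assuming a recent dplyr (1.0.0 or later). The result names only_aa, renamed and with_copy are illustrative.

```r
library(dplyr)

# Rebuild the example data frame; check.names = FALSE keeps the spaces
df <- data.frame(`a a` = 1:3, `a b` = 2:4, check.names = FALSE)

# select() and rename() accept the column name as a plain string
only_aa <- select(df, "a a")
renamed <- rename(df, a = "a a")

# Inside mutate() a bare string is just a string, so the .data pronoun
# is used to look the column up by name
with_copy <- mutate(df, a = .data[["a a"]])

names(renamed)
names(with_copy)
```

rename() keeps every column and only changes the name, whereas select(df, a = "a a") would also drop the other columns.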

Mark