Edit: a near-duplicate
How to reference column names that start with a number, in data.table
The linked post is about data.table rather than dplyr, so the problem is technically package-specific, but the solution is the same.

Start of original post

I can't figure out how to filter on the following column with dplyr::filter:

set.seed(1)
library(dplyr)
df <- as.data.frame(matrix(sample(c(TRUE, FALSE), 10, replace=TRUE), ncol=1)) %>%
        setNames(c(paste0("1", letters[1])))
      # 1a
# 1   TRUE
# 2   TRUE
# 3  FALSE
# 4  FALSE
# 5   TRUE
# 6  FALSE
# 7  FALSE
# 8  FALSE
# 9  FALSE
# 10  TRUE

df[df$"1a"==TRUE,]
# [1] TRUE TRUE TRUE TRUE

df %>% dplyr::filter("1a"==TRUE)
# [1] 1a
# <0 rows> (or 0-length row.names)
CPak
  • Use ``df %>% dplyr::filter(`1a` == TRUE)`` – pogibas Dec 29 '17 at 17:09
  • It doesn't really matter that it's a column name. Any non-standard variable (or column) name can be used with backticks. `` `1a` = 4; `1a` + 1 `` – Gregor Thomas Dec 29 '17 at 17:11
  • Appreciate the help! If you'd like to post as an answer, I'll accept – CPak Dec 29 '17 at 17:12
  • Different question because it's about `data.table`, but the backticks are also covered in [How to reference column names that start with a number, in `data.table`?](https://stackoverflow.com/q/15637132/903061) (it's also probably good to link to such a closely related question). – Gregor Thomas Dec 29 '17 at 17:16
  • @Gregor yes, definitely a duplicate. Added link to my post – CPak Dec 29 '17 at 17:23
  • See also `?Quotes`. The 'Names and Identifiers' section is especially relevant here: "other [syntactically invalid] names can be used provided they are quoted. The preferred quote is the backtick" – Henrik Dec 29 '17 at 17:26
  • I would say *related* but not a duplicate. The other one is `data.table`-specific, and this one is `dplyr`-specific. I think this one should get an answer and remain separate but linked. There are definitely people who will search for `dplyr` or `data.table`-specific answers for this and be confused if they don't find them. – Gregor Thomas Dec 29 '17 at 17:29
  • _Related_ on `select`ing syntactically invalid names: [dplyr: select column names containing white space](https://stackoverflow.com/questions/22842232/dplyr-select-column-names-containing-white-space) – Henrik Dec 29 '17 at 17:37

1 Answer

You can use backticks to refer to variables with non-standard names. This works whether they are columns of a data frame or not.

For this specific case

df %>% dplyr::filter(`1a`)  # note that == TRUE is never needed

Or generally,

`2b` = 1:5
mean(`2b`)
# [1] 3
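
To see why the quoted version in the question returned zero rows, here is a minimal sketch. The small data frame is made up for illustration (it is not the question's data), and the `.data[["1a"]]` form is an alternative that dplyr's `.data` pronoun offers when the column name is held in a string.

```r
library(dplyr)

# Hypothetical data frame with a syntactically invalid column name
df <- data.frame(`1a` = c(TRUE, TRUE, FALSE, FALSE, TRUE), check.names = FALSE)

# In double quotes, "1a" is a character constant, not a column reference.
# TRUE is coerced to "TRUE", so the comparison is a single FALSE.
"1a" == TRUE
# [1] FALSE

# filter() recycles that single FALSE over every row, so nothing is kept
by_string <- df %>% filter("1a" == TRUE)
nrow(by_string)
# [1] 0

# Backticks turn 1a into a name, which filter() looks up as a column
by_name <- df %>% filter(`1a`)
nrow(by_name)
# [1] 3

# The .data pronoun takes the column name as an ordinary string
by_pronoun <- df %>% filter(.data[["1a"]])
nrow(by_pronoun)
# [1] 3
```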

Of course, you shouldn't make a habit of this - use standard names whenever possible.
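
If you control the data, one way to get standard names is base R's `make.names()`. This is only a sketch on a made-up data frame; whether the `X` prefix it adds is acceptable depends on your own naming conventions.

```r
# Hypothetical data frame with two syntactically invalid column names
df <- data.frame(`1a` = c(TRUE, FALSE), `my col` = 1:2, check.names = FALSE)

# make.names() prepends "X" to a leading digit and replaces
# invalid characters (such as spaces) with "."
names(df) <- make.names(names(df), unique = TRUE)
names(df)
# [1] "X1a"    "my.col"

# The columns can now be referenced without backticks
df[df$X1a, ]
```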


As mentioned in the comments, the ?Quotes documentation is helpful. It states (in the Names and Identifiers section):

Almost always, other names can be used provided they are quoted. The preferred quote is the backtick (`), and deparse will normally use it, but under many circumstances single or double quotes can be used (as a character constant will often be converted to a name). One place where backticks may be essential is to delimit variable names in formulae: see formula.
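
The two cases the documentation mentions can be sketched as follows. The data frame `d` and its values are invented for illustration.

```r
# A character constant on the left of an assignment is converted to a name,
# so double quotes work here just as backticks do
"2b" <- 1:5
mean(`2b`)
# [1] 3

# Hypothetical data with a predictor whose name starts with a digit
d <- data.frame(`1a` = c(1, 2, 3, 4), y = c(2.1, 3.9, 6.2, 7.8),
                check.names = FALSE)

# In a formula, backticks are needed to refer to the column 1a;
# "1a" in double quotes would be treated as a string, not a variable
fit <- lm(y ~ `1a`, data = d)
coef(fit)
```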

Gregor Thomas