
I want to select columns from my tibble that end with the letter R AND do NOT start with a character string ("hc"). For instance, if I have a dataframe that looks like this:

name  hc_1  hc_2  hc_3r  hc_4r  lw_1r  lw_2  lw_3r  lw_4   
Joe   1     2     3      2      1      5     2      2
Barb  5     4     3      3      2      3     3      1

To do what I want, I've tried many options, but I'm surprised that this one doesn't work:

library(tidyverse)
data %>%
  select(ends_with("r"), !starts_with("hc"))

When I try it, I get this error:

Error: !starts_with("hc") must evaluate to column positions or names, not a logical vector

I've also tried using negate() and get the same error.

library(tidyverse)
data %>%
  select(ends_with("r"), negate(starts_with("hc")))

Error: negate(starts_with("hc")) must evaluate to column positions or names, not a function

I'd like to keep the answer within the dplyr select function because, once I select the variables, I'm going to end up reversing them by using mutate_at, so a tidy solution is best.

Thank you!

A. Suliman
J.Sabree

2 Answers


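We can use - instead of !. Here starts_with() returns column positions rather than a logical vector, so applying ! to it produces the logical vector that select() rejects, while - drops the matched columns.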

library(dplyr)
data %>%
     select(ends_with("r"), -starts_with("hc"))
 #   lw_1r lw_3r
 #1     1     2
 #2     2     3

data

data <- structure(list(name = c("Joe", "Barb"), hc_1 = c(1L, 5L), hc_2 = c(2L, 
4L), hc_3r = c(3L, 3L), hc_4r = 2:3, lw_1r = 1:2, lw_2 = c(5L, 
3L), lw_3r = 2:3, lw_4 = 2:1), class = "data.frame", row.names = c(NA, 
-2L))
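
Since the question mentions reversing the selected columns with mutate_at afterwards, here is a minimal sketch of how the same selection could be reused inside vars(). The 1 to 5 scoring range and the scale_max name are assumptions for illustration only; the question does not state the scale.

```r
library(dplyr)

data <- data.frame(
  name = c("Joe", "Barb"),
  hc_1 = c(1L, 5L), hc_2 = c(2L, 4L), hc_3r = c(3L, 3L), hc_4r = 2:3,
  lw_1r = 1:2, lw_2 = c(5L, 3L), lw_3r = 2:3, lw_4 = 2:1
)

# Assumption: items are scored 1 to 5, so a reversed score is (5 + 1) - x.
scale_max <- 5L  # hypothetical scale maximum, not given in the question

# Same helpers as in select(): columns ending in "r", minus those starting with "hc".
reversed <- data %>%
  mutate_at(vars(ends_with("r"), -starts_with("hc")), ~ (scale_max + 1L) - .x)

reversed[, c("hc_3r", "lw_1r", "lw_3r")]
```

Only lw_1r and lw_3r are transformed; the hc_ columns are left as they were.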
akrun
  • thank you! But let's say that I had another set of columns that started with jw_1, jw_2r, etc. How would I then say I only want the columns that start with lw & end with r? Is there a way to link the requirements together? – J.Sabree Aug 30 '19 at 17:15
  • You can use another set of `-starts_with("jw")` if you are going by the `starts_with/ends_with` route, as this can only take a single pattern. Or else you may have to use `matches`, as in the other answer: `matches("^lw.*r$")` – akrun Aug 30 '19 at 17:16
  • okay, thank you! I was hopeful that I could put an & statement in the select command, but I guess that's not possible. Thanks! – J.Sabree Aug 30 '19 at 17:21
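
For the follow-up case in these comments (keep only columns that start with lw and end with r), the two routes could be sketched as below. The jw_ columns are hypothetical, modelled on the names in the comment. The second form assumes dplyr 1.0.0 or later, where selection helpers can be combined with & and !; it does not apply to the version that produced the error in the question.

```r
library(dplyr)

data <- data.frame(
  name = c("Joe", "Barb"),
  lw_1r = 1:2, lw_2 = c(5L, 3L),
  jw_1 = c(4L, 1L), jw_2r = c(2L, 2L)  # hypothetical jw_ columns
)

# Route 1: a single regular expression, "starts with lw and ends with r".
by_regex <- data %>% select(matches("^lw.*r$"))

# Route 2 (assumption: dplyr >= 1.0.0): combine helpers with a boolean operator.
by_helpers <- data %>% select(starts_with("lw") & ends_with("r"))

names(by_regex)
names(by_helpers)
```

Both selections keep only lw_1r for this example data.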

If you need a more advanced regular expression, use matches()

library(dplyr)
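# First character is anything except h or c, and the name ends with r.
# Note: this also drops names that start with a lone h or c, not only those starting with "hc".
data %>% select(matches('^[^hc].*r$'))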
  lw_1r lw_3r
1     1     2
2     2     3
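
The character class [^hc] rejects any name whose first character is h or c, which is broader than "does not start with hc". A negative lookahead expresses the original requirement more directly. The sketch below assumes a tidyselect version in which matches() accepts perl = TRUE, and the cat_1r column is hypothetical, added only to show where the two patterns differ.

```r
library(dplyr)

data <- data.frame(
  name = c("Joe", "Barb"),
  hc_3r = c(3L, 3L), lw_1r = 1:2, lw_2 = c(5L, 3L),
  cat_1r = c(4L, 2L)  # hypothetical column: starts with "c" but not with "hc"
)

# Character class: excludes every name beginning with h or with c.
class_cols <- data %>%
  select(matches("^[^hc].*r$")) %>%
  names()

# Negative lookahead (Perl-style regex): excludes only names beginning with "hc".
# Assumption: matches() supports perl = TRUE in the installed tidyselect.
lookahead_cols <- data %>%
  select(matches("^(?!hc).*r$", perl = TRUE)) %>%
  names()

class_cols
lookahead_cols
```

With this data the character class keeps only lw_1r, while the lookahead keeps both lw_1r and cat_1r.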
A. Suliman