
I am looking to use dplyr to select only the variables in my dataset whose names end in "500" exactly. That is, I do not want variables ending in 2500, 5500, or 1500. Just 500. I tried using ends_with("500"), but this also includes the variables ending in 2500, etc., that I want to exclude.

I'm sure this is a very simple thing to do, but I am having trouble finding exactly what I want via Google search.

Thanks!

z_11122
  • No sample data, no sample code. Please spend some time making this question [reproducible](https://stackoverflow.com/q/5963269) (more refs: [mcve] and https://stackoverflow.com/tags/r/info). – r2evans Oct 21 '21 at 16:59
  • What do you mean by "ends in 500 exactly"? `2500` does end in `500`. – Cettt Oct 21 '21 at 17:00

1 Answer

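We may use `matches()` instead of `ends_with()`. `ends_with()` does a fixed (literal) match on the suffix, so "500" also matches names ending in 1500, 2500, etc. `matches()` takes a regular expression, which gives more flexibility: we can specify `\\D+` before the 500 so that it must be preceded by one or more non-digit characters. (This is a guess at the naming pattern, as it is not clear without a reproducible example.)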

library(dplyr)

# keep columns whose names end in one or more non-digits followed by 500
df1 %>%
  select(matches('\\D+500$'))
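A quick way to see the difference is sketched below, using the example column names given in the comments on this answer. The object names `by_suffix` and `by_regex` are only illustrative.

```r
library(dplyr)

# Example data with the column names mentioned in the comments
df1 <- tibble(hello_500 = 1:3, hello_bye500 = 2:4,
              hello_2500 = 1:3, hello_bye2500 = 1:3)

# ends_with() is a literal suffix match, so all four columns are kept
by_suffix <- df1 %>% select(ends_with("500"))
names(by_suffix)

# matches() uses a regular expression: at least one non-digit, then 500 at the end
by_regex <- df1 %>% select(matches("\\D+500$"))
names(by_regex)
#> [1] "hello_500"    "hello_bye500"
```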
akrun
  • Thank you. Could you explain a little more what the call to "matches" means? – z_11122 Oct 21 '21 at 17:14
  • @Cettt basically, I want to include things that end in 500 but not 2500. It is a measure of distance, so I only want variables measured at 500 meters, not 2500 meters. – z_11122 Oct 21 '21 at 17:15
  • @ZanePatterson The `\\D+` matches one or more characters that are not digits, followed by 500 at the end (`$`) of the string, i.e. the column name – akrun Oct 21 '21 at 17:16
  • @ZanePatterson If you can tell us which characters come before the digits, it would help in formulating the `matches` pattern. Here, I was just guessing that the column names have a non-numeric part followed by a numeric part – akrun Oct 21 '21 at 17:17
  • @ZanePatterson Do you have column names as `"500"` `"2500"` or `"hello_500"`, `"hello_2500"` – akrun Oct 21 '21 at 17:19
  • Yes, there are such variables, but there are others such as `hello_bye500` and `hello_bye2500` – z_11122 Oct 21 '21 at 17:37
  • @ZanePatterson doesn't matter – akrun Oct 21 '21 at 17:38
  • @ZanePatterson The code still works with `df1 <- tibble(hello_500 = 1:3, hello_bye500 = 2:4, hello_2500 = 1:3, hello_bye2500 = 1:3)` – akrun Oct 21 '21 at 17:39
  • Sorry, I wasn't exactly sure what you were asking. I tested it and it seems to have worked perfectly. Thanks a lot! – z_11122 Oct 21 '21 at 17:40
  • @ZanePatterson It's okay. The `\\D+` works in most cases, except if you have some special characters. That is why I was asking for the pattern – akrun Oct 21 '21 at 17:42
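
One edge case worth checking, as a hedged sketch rather than part of the accepted answer: `\\D+` requires at least one non-digit character before the 500, so a column named exactly `500` would not be selected. Assuming such a column exists (the tibble `df2` and its column names are made up for illustration), the pattern `(^|\\D)500$` accepts either the start of the name or a single non-digit before 500.

```r
library(dplyr)

# Hypothetical data: includes columns named exactly "500" and "2500"
df2 <- tibble(`500` = 1:3, dist_500 = 1:3, `2500` = 1:3, dist_2500 = 1:3)

# Original pattern: needs a non-digit before 500, so the bare `500` column is dropped
strict <- df2 %>% select(matches("\\D+500$"))
names(strict)

# Alternative: 500 preceded by the start of the name or by one non-digit
relaxed <- df2 %>% select(matches("(^|\\D)500$"))
names(relaxed)
```

Which pattern is appropriate depends on the actual column names, so it is worth printing `names()` on the real data before relying on either.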