
I'm trying to remove rows in a dataframe that do not contain . or _

Basically, keep rows that have . or _

I've tried this with no luck, where V1 is the name of the column:

test = filter(test, grepl("_|.",V1))

For example, from "test", "test.com", "test_com", I'd like to keep "test.com" and "test_com" only.

Misha

1 Answer


. has a special meaning in regex (it matches any single character), so the pattern "_|." matches every non-empty string. You need to escape it in grepl. Try:

dplyr::filter(test, grepl("_|\\.",V1))

#         V1
#1  test.com
#2  test_com
#3 test.com.
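
To see why the unescaped pattern keeps every row, here is a minimal base R sketch using the same sample values as the data section. The variable names are only for illustration, and the character class [._] is an alternative spelling that the original answer does not use; inside brackets . is literal, so it needs no escaping.

```r
V1 <- c("test", "test.com", "test_com", "test.com.")

# Unescaped: "." matches any single character, so every value matches
unescaped <- grepl("_|.", V1)
unescaped
# [1] TRUE TRUE TRUE TRUE

# Escaped: only a literal "." or "_" matches
escaped <- grepl("_|\\.", V1)
escaped
# [1] FALSE  TRUE  TRUE  TRUE

# Character class: "." is literal inside [], so no backslashes are needed
char_class <- grepl("[._]", V1)
char_class
# [1] FALSE  TRUE  TRUE  TRUE
```

Either the escaped or the bracketed pattern can be passed to dplyr::filter in the same way.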

To remove rows that end with . we can use

dplyr::filter(test, !grepl("\\.$", V1))

#        V1
#1     test
#2 test.com
#3 test_com

Or in base R

subset(test, !endsWith(V1, "."))
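
If both conditions are wanted at once (an assumption, since the question only asks for the first), the two tests can be combined in one call. This is a sketch in base R; the name kept is hypothetical.

```r
test <- data.frame(V1 = c("test", "test.com", "test_com", "test.com."),
                   stringsAsFactors = FALSE)

# Keep rows that contain "." or "_" and do not end with "."
kept <- subset(test, grepl("[._]", V1) & !endsWith(V1, "."))
kept$V1
# [1] "test.com" "test_com"
```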

data

test <- data.frame(V1 = c("test", "test.com", "test_com", "test.com."),
        stringsAsFactors = FALSE)
Ronak Shah