Does anyone know how to split a string in R at a punctuation character, or how to remove everything before the punctuation while keeping the punctuation itself?

x <- c("a>1", "b2<0", "yy01>10")

The following is the desired result:

"a", "b2", "yy01"

">1", "<0", ">10"

To get the first part I can do:

gsub("\\b\\d+\\b|[[:punct:]]", "", x)

"a"    "b2"   "yy01"

But I am not sure how to get the second part. Does anyone have an idea?

Thanks

  • Not a duplicate, but [this](https://stackoverflow.com/q/46884556/5325862) post might help you figure out lookahead / lookbehind patterns – camille Feb 15 '21 at 19:39

1 Answer

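Use strsplit from base R with a regex that splits at the word boundary immediately preceding one of the operators < or >. The lookahead (?=[<>]) matches without consuming the operator, so the operator stays attached to the second piece: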

do.call(cbind, strsplit(x, "\\b(?=[<>])", perl = TRUE))
#     [,1] [,2] [,3]  
#[1,] "a"  "b2" "yy01"
#[2,] ">1" "<0" ">10" 
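
To get the two separate vectors shown in the question, the rows of that matrix can be pulled out. This is only a small follow-up sketch; the name parts is introduced here for illustration and is not part of the original answer.

```r
x <- c("a>1", "b2<0", "yy01>10")

# Same split as above; "parts" is a hypothetical name for the resulting matrix.
parts <- do.call(cbind, strsplit(x, "\\b(?=[<>])", perl = TRUE))

parts[1, ]  # text before the operator
parts[2, ]  # operator plus everything after it
```

Note that cbind only yields a clean two-row matrix when every string splits into exactly two pieces, as it does for this input.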
akrun
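
As an alternative to splitting, and not the method used in the answer above, each part could also be extracted directly with sub. This is a minimal sketch assuming the only operators are < and > and that each string contains one of them; lhs and rhs are names introduced here for illustration.

```r
x <- c("a>1", "b2<0", "yy01>10")

# Part before the operator: delete from the first < or > to the end of the string.
lhs <- sub("[<>].*$", "", x)

# Operator and what follows: delete everything before the first < or >.
rhs <- sub("^[^<>]*", "", x)

lhs
rhs
```

This sidesteps lookarounds entirely, which may be easier to adapt if the set of operators changes (for example by replacing [<>] with a wider character class).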