In R you can use the strsplit function to split a character vector on a delimiter (the split argument) as follows:

x <- "What is this?  It's an onion.  What! That's| Well Crazy."
unlist(strsplit(x, "[\\?\\.\\!\\|]", perl=TRUE))

## [1] "What is this"    "  It's an onion" "  What"          " That's"        
## [5] " Well Crazy"

I'd like to keep the delimiter at the end of each piece instead of dropping it. The desired output would be:

## [1] "What is this?"    "  It's an onion." "  What!"          " That's|"        
## [5] " Well Crazy."
Tyler Rinker

1 Answer

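You can use a lookbehind, "(?<=DELIMITERS)". It matches the empty position right after a delimiter, so the split consumes no characters: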

unlist(strsplit(x, "(?<=[?.!|])", perl=TRUE))

## [1] "What is this?"    "  It's an onion." "  What!"          " That's|"        
## [5] " Well Crazy."
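
To make the difference concrete, here is a small self-contained sketch that runs the consuming split and the lookbehind split side by side on the string from the question. The `trimws()` cleanup at the end is my own addition (it assumes R 3.2.0 or later) and is not part of the original answer.

```r
# Same input as in the question.
x <- "What is this?  It's an onion.  What! That's| Well Crazy."

# Consuming split: the matched delimiter is dropped from the pieces.
# Inside a character class, ? . ! | are literal, so no backslashes are needed.
dropped <- unlist(strsplit(x, "[?.!|]", perl = TRUE))

# Zero-width split: the lookbehind matches the empty position right
# after a delimiter, so every character of x stays in the output.
kept <- unlist(strsplit(x, "(?<=[?.!|])", perl = TRUE))

# Optional cleanup (an assumption about what you want, not in the answer):
# strip the leading spaces left over from the sentence spacing.
trimmed <- trimws(kept)
# trimmed is c("What is this?", "It's an onion.", "What!", "That's|", "Well Crazy.")
```

Because nothing is consumed, `paste(kept, collapse = "")` gives back the original string, which is a handy sanity check.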
Tyler Rinker
  • You don't need all of the backslashes. `unlist(strsplit(x, "(?<=[?.!|])", perl=TRUE))` returns the same result – Jake Burkhead Feb 01 '14 at 01:49
  • Would love an option where the split is a sequence (e.g. "[0-9]+") rather than a single character... – dsz May 04 '22 at 23:02
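
For the multi-character case raised in the last comment, the lookbehind trick does not carry over directly, since PCRE rejects an unbounded lookbehind such as "(?<=[0-9]+)". One possible workaround, offered only as a sketch, is to extract the pieces instead of splitting: match "anything up to and including a delimiter, or whatever is left at the end". The helper name `split_keep_delim` is hypothetical, and it assumes `delim` is a PCRE pattern that never matches the empty string.

```r
# Hypothetical helper: split each string after every match of `delim`,
# keeping the matched delimiter at the end of its piece.
# Assumes `delim` is a PCRE pattern that cannot match the empty string.
split_keep_delim <- function(x, delim) {
  # (?s)            let . match newlines too
  # .*?(?:delim)    shortest run of text ending in a delimiter
  # |.+$            or the trailing text after the last delimiter
  pattern <- paste0("(?s).*?(?:", delim, ")|.+$")
  # Returns a list with one character vector per element of x, like strsplit().
  regmatches(x, gregexpr(pattern, x, perl = TRUE))
}

pieces <- split_keep_delim("a1b22c", "[0-9]+")[[1]]
# pieces is c("a1", "b22", "c")
```

With the question's string and `delim = "[?.!|]"`, this should reproduce the lookbehind result, so it can serve as a drop-in generalisation when the delimiter has variable length.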