I would like to keep all the text that occurs before the last space in a string.

For example:

x <- c("New England Patriots", "Carolina Panthers")

I have tried str_extract(x, "\\w+"), but that returns only the first word of each string: "New" and "Carolina".

I would like to get "New England" and "Carolina".

Bryan Adams

1 Answer

Try this:

library(stringr)
str_match(c("New England Patriots","Carolina Panthers"), "(^.+)\\s")[, 2]

[1] "New England" "Carolina"   

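This performs a "greedy match": `.+` consumes as many characters as it can, and the regex engine then backtracks only as far as needed for `\\s` to match, which lands on the last whitespace character. In effect the regex says "everything from the start of the string to the last space".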

I used str_match instead of str_extract so that only the part in parentheses is returned, which leaves out the last space itself. You could also use str_extract and then trim the trailing space with, for example, trimws.
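
As a rough sketch of the str_extract plus trimws alternative mentioned above, something like the following should give the same result. The variable names (`with_space`, `trimmed`, `base_result`) are only illustrative, and the base R `sub` line is an extra option that is not part of the original answer.

```r
library(stringr)

x <- c("New England Patriots", "Carolina Panthers")

# Greedy match from the start of the string up to and including
# the last whitespace character, then strip that trailing space.
with_space <- str_extract(x, "^.+\\s")
trimmed <- trimws(with_space)

# Base R alternative (an assumption, not from the answer above):
# delete the last run of whitespace and everything after it.
base_result <- sub("\\s+\\S*$", "", x)

print(trimmed)
print(base_result)
```

One difference to keep in mind: for a string that contains no whitespace at all, str_extract returns NA, while sub leaves the input unchanged.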

neilfws
  • I know that \\s matches any whitespace, but could you explain the (^.+) part or point me to an article that explains the options? – Bryan Adams Sep 06 '18 at 12:19
  • `^` = start of the string. `.` = any single character. `+` = one or more of the preceding item, matched greedily. `()` = capture group: store this part of the match separately (returned here using [, 2]). [More information](https://cran.r-project.org/web/packages/stringr/vignettes/regular-expressions.html). – neilfws Sep 06 '18 at 22:02
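
To make the role of the parentheses and the `[, 2]` index more concrete, here is a small sketch of what str_match returns for the example strings. The names `m`, `full_match`, and `group_one` are illustrative only.

```r
library(stringr)

x <- c("New England Patriots", "Carolina Panthers")

# str_match returns a character matrix with one row per input string.
# Column 1 holds the full match; column 2 holds the first capture group.
m <- str_match(x, "(^.+)\\s")

full_match <- m[, 1]  # includes the trailing space matched by \\s
group_one <- m[, 2]   # only the text inside the parentheses

print(m)
```

Because the `\\s` sits outside the parentheses, the space is part of the full match in column 1 but not of the captured group in column 2.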