2

The dataframe below has a sku_name column. I want to remove the part that starts with 'V' and ends with 'b', but my code str_remove_all(sku_name,"^(V).*?(\\b)$") doesn't work.

Can anyone help?

library(dplyr)    # %>% and mutate()
library(stringr)  # str_remove_all()
mydata <- data.frame(sku_name=c('wk0001 V1b','123780 PRO V326b','ttttt V321b'))
mydata %>% mutate(sku_name_new=str_remove_all(sku_name,"^(V).*?(\\b)$"))
zx8754
anderwyang

3 Answers

5
vec <- c('wk0001 V1b','123780 PRO V326b','ttttt V321b')
sub("V.*b$", "", vec)
# [1] "wk0001 "     "123780 PRO " "ttttt "     
stringr::str_remove(vec, "V.*b$")
# [1] "wk0001 "     "123780 PRO " "ttttt "     

This also works with the non-greedy "V.*?b$"; whether you need that is up to you.
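To illustrate when greedy and non-greedy actually differ, here is a small sketch in base R. The input string is made up for the example (it is not from the question), and perl = TRUE is used so that the lazy quantifier is handled by PCRE.

```r
# Hypothetical input: there is a second "b" after the V...b part
x <- "a V1b cab"

# Greedy, no anchor: runs from the first V to the LAST b
greedy <- sub("V.*b", "", x, perl = TRUE)         # "a "

# Non-greedy, no anchor: stops at the FIRST b after the V
lazy <- sub("V.*?b", "", x, perl = TRUE)          # "a  cab"

# Non-greedy with $: the match must still reach the end of the string,
# so the result is the same as the greedy version
lazy_anchor <- sub("V.*?b$", "", x, perl = TRUE)  # "a "
```

With the trailing $, as in the patterns above, the two forms remove the same text, so the choice only matters once the anchor is dropped.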

BTW: \\b is a word boundary, not the literal b. (V) captures the V as a group, which isn't necessary (and looks a little confusing). The real culprit is the ^, which means start of string (as you mentioned): the pattern can only match a string that itself starts with V, i.e. one of the form "Vsomethingb". The current vec strings start with "w", "1", and "t"; none of them starts with V.
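The points above can be checked directly. This is a minimal sketch in base R (grepl and sub, with perl = TRUE where the pattern uses a lazy quantifier or \\b); the test strings "V1b" and "V1x" are made up for illustration.

```r
vec <- c("wk0001 V1b", "123780 PRO V326b", "ttttt V321b")
original <- "^(V).*?(\\b)$"  # pattern from the question
fixed    <- "V.*b$"          # pattern from this answer

# None of the strings starts with V, so ^V can never match them
starts_with_v <- grepl("^V", vec)                     # FALSE FALSE FALSE
original_hits <- grepl(original, vec, perl = TRUE)    # FALSE FALSE FALSE

# The original pattern does match a string that starts with V
original_on_v <- grepl(original, "V1b", perl = TRUE)  # TRUE

# \\b is a word boundary: it matches at the end of "V1x" even though
# there is no literal b there, while a plain b does not
boundary_hit <- grepl("V.*\\b$", "V1x", perl = TRUE)  # TRUE
literal_hit  <- grepl("V.*b$", "V1x")                 # FALSE

# Without ^, and with a literal b, the V...b suffix is removed
cleaned <- sub(fixed, "", vec)
```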

If you need a regex reference, https://stackoverflow.com/a/22944075/3358272 is a good guide to many of the components (and links to questions/answers about them).

r2evans
1

You can do it with either of these patterns:

vector <- c('wk0001 V1b','123780 PRO V326b','ttttt V321b')

# if only digits can appear between the "V" and the "b"
stringr::str_remove(vector, "V\\d+b")

# if any characters can appear between the "V" and the "b":
# at least one character, none of which is "V" or "b"
stringr::str_remove(vector, "V[^Vb]+b")
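As a rough comparison of the two patterns, the sketch below uses base R's sub(), which, like str_remove, removes only the first match. The inputs "item VXLb" and "V2b-kit V1b" are hypothetical and only serve to show where the patterns behave differently.

```r
x <- c("wk0001 V1b", "item VXLb")  # second element is a made-up example

# Digits only between V and b: "VXLb" has no digits, so it is left alone
digits_only <- sub("V\\d+b", "", x, perl = TRUE)
# "wk0001 "  "item VXLb"

# Any characters except V and b: "VXLb" is removed as well
any_chars <- sub("V[^Vb]+b", "", x, perl = TRUE)
# "wk0001 "  "item "

# Neither pattern is anchored with $, so a V...b run at the start of the
# string is matched first and the one at the end is kept
first_only <- sub("V\\d+b", "", "V2b-kit V1b", perl = TRUE)
# "-kit V1b"
```

If the part to remove is always at the end of the string, appending $ to either pattern restricts the match to that position.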
Alvaro Morales
0

You were actually really close.

Fix the regex using one of the alternatives mentioned by @r2evans and it's done!

Here is the code as a dplyr pipeline, since that is closer to what you already have.

library(dplyr)    # %>% and mutate()
library(stringr)  # str_remove_all()

mydata <- data.frame(sku_name=c('wk0001 V1b','123780 PRO V326b','ttttt V321b'))

mydata %>% mutate(sku_name_new=str_remove_all(sku_name,"V.*b$"))

          sku_name sku_name_new
1       wk0001 V1b      wk0001 
2 123780 PRO V326b  123780 PRO 
3      ttttt V321b       ttttt 


AugtPelle