
Here is a simple example:

library(dplyr)
library(stringr)

> dataf <- data_frame(text = c('this is a pip||e |',
+                              'this is |||'))
> dataf
# A tibble: 2 x 1
                text
               <chr>
1 this is a pip||e |
2        this is |||

I want to replace all the pipes in the data with an empty string, so that they disappear. However, my attempt only removes the first run of pipes in each string:

> dataf %>% mutate(text = str_replace(text, '\\|+', ""))
# A tibble: 2 x 1
              text
             <chr>
1 this is a pipe |
2         this is

What is wrong here? Thanks!

ℕʘʘḆḽḘ
  • In base R, `gsub("\\|", "", 'this is |||')` works, as does `sub("\\|+", "", 'this is |||')`. – lmo Jun 13 '17 at 18:01
  • Thanks, but is there a way to do this with the new `stringr` as well? – ℕʘʘḆḽḘ Jun 13 '17 at 18:02
  • Just confirmed that `str_replace(text, '\\|+', "")` will work. – lmo Jun 13 '17 at 18:03
  • https://stackoverflow.com/questions/4736/learning-regular-expressions – jogo Jun 13 '17 at 18:03
  • Aha! The `gsub` solution does not work with the updated example. – ℕʘʘḆḽḘ Jun 13 '17 at 18:07
  • `gsub` will work; maybe you meant the `sub` solution. That regex is designed to remove only the first set of adjacent pipes. `gsub`, on the other hand, will remove them all. You could even use `gsub("|", "", 'this is a pip||e |', fixed=TRUE)` to make it a bit more readable. – lmo Jun 13 '17 at 18:12

1 Answer


`str_replace` only replaces the first match in each string. You can use `str_replace_all` from `stringr` to remove every match of the pattern:

dataf %>% mutate(text = str_replace_all(text, '\\|', ""))

# A tibble: 2 × 1
#            text
#           <chr>
#1 this is a pipe
#2       this is 
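
For comparison, here is a minimal sketch of the two functions side by side on the same strings as in the question. The variable names (`x`, `first_only`, `all_regex`, `all_fixed`) are just for illustration, and the `fixed("|")` variant is an optional alternative that avoids escaping the pipe:

```r
library(stringr)

x <- c("this is a pip||e |", "this is |||")

# str_replace() substitutes only the first match in each string,
# so only the first run of pipes is removed.
first_only <- str_replace(x, "\\|+", "")

# str_replace_all() substitutes every match in each string.
all_regex <- str_replace_all(x, "\\|", "")

# fixed() matches the pattern as a literal string, so "|" needs no escaping.
all_fixed <- str_replace_all(x, fixed("|"), "")

print(first_only)
print(all_regex)
print(all_fixed)
```

The first call leaves the trailing pipe in the first string, while the other two remove every pipe.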
Psidom
  • I prefer lmo's answer. Let's embrace regex. – M-- Jun 13 '17 at 18:05
  • It's nice to have multiple solutions to simple problems. – ℕʘʘḆḽḘ Jun 13 '17 at 18:06
  • @Masoud I'd just like to point out that there is a difference between `str_replace_all(, "\\|", ..)` and `str_replace(, "\\|+", ..)`. It is the same as the difference between `gsub` and `sub`. You can test the two options with the string "ab||cd|". – Psidom Jun 13 '17 at 18:09
  • @Psidom I agree. That was a personal preference for this specific problem. – M-- Jun 13 '17 at 18:14
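
To make the `sub` versus `gsub` distinction from the comments concrete, here is a small base R sketch using the suggested test string "ab||cd|". The variable names are illustrative only:

```r
x <- "ab||cd|"

# sub() replaces only the first match: the first run of pipes.
sub_result <- sub("\\|+", "", x)

# gsub() replaces every match.
gsub_result <- gsub("\\|", "", x)

# With fixed = TRUE the pattern is a literal string, so no escaping is needed.
gsub_fixed_result <- gsub("|", "", x, fixed = TRUE)

print(sub_result)
print(gsub_result)
print(gsub_fixed_result)
```

Only the `sub` call leaves the final pipe in place.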