I am working with the R programming language.

I have a column of data that looks something like this:

string = c("a1 123-456-7899 hh", "b 124-123-9999 b3")

I would like to remove the "phone numbers" so that the final result looks like this:

[1] "a1 hh" "b  b3"

I tried to apply the answer from "Regular expression to match standard 10 digit phone number" to my problem:

gsub("^(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}$", "", string, fixed = TRUE)

But I get the following error: Error: '\+' is an unrecognized escape in character string starting ""^(\+"

Can someone please show me how to fix this?

Thanks!

stats_noob

1 Answer

Try:

library(stringr)

s <- c("a1 123-456-7899 hh", "b 124-123-9999 b3")
# Remove the first digits-hyphen-digits-hyphen-digits run and the space after it
result <- str_replace(s, "\\d+[-]\\d+[-]\\d+\\s", "")
print(result)

OUTPUT:

[1] "a1 hh" "b b3" 

The pattern looks for:

  • \\d+ : one or more digits, followed by
  • [-] : a hyphen, followed by
  • \\d+ : one or more digits, followed by
  • [-] : a hyphen, followed by
  • \\d+ : one or more digits, followed by
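  • \\s : a whitespace character (here, the space after the number)

The match is then replaced with "", an empty string. Note that str_replace only replaces the first match in each string; use str_replace_all if a string can contain more than one number.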
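As for the error in the question: inside an R string literal every regex backslash has to be doubled ("\\d", not "\d"), which is why '\+' is reported as an unrecognized escape. Two other things would also stop that call from matching even with the escapes fixed: fixed = TRUE makes gsub treat the pattern as literal text, and the ^ and $ anchors only allow a match when the whole string is a phone number. Below is a minimal base R sketch of the same idea. The simplified pattern, the use of perl = TRUE, and the variable names pattern and cleaned are my own choices, not part of the linked answer:

```r
# Same input as in the question
string <- c("a1 123-456-7899 hh", "b 124-123-9999 b3")

# Every regex backslash is doubled for the R string literal.
# No ^ or $ anchors, so the number may sit in the middle of the string.
# The trailing \\s? also removes one whitespace character after the number.
pattern <- "\\(?\\d{3}\\)?[\\s.-]\\d{3}[\\s.-]\\d{4}\\s?"

# perl = TRUE so that \\s is understood inside the [...] character class;
# fixed = TRUE is left out because it would disable regex matching entirely.
cleaned <- gsub(pattern, "", string, perl = TRUE)
print(cleaned)
# [1] "a1 hh" "b b3"
```

Because gsub replaces every match, this also handles strings that contain several numbers.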

ScottC