I have a data frame with a column that contains:

"HYD_SOA_UNBLOCK~SOA_BLOCK-UK|SOA_BLOCK-DE||SOA_BLOCK-FR||SOA_BLOCK-IT||SOA_BLOCK-ES|"

I want the result to be:

"HYD_SOA_UNBLOCK~SOA_BLOCK-UK|SOA_BLOCK-DE|SOA_BLOCK-FR|SOA_BLOCK-IT|SOA_BLOCK-ES|"

I tried:

leadtemp$collate = gsub("||","|",leadtemp$collate) 

but it does not work.

Please help me replace "||" with "|".

eli-k

3 Answers

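As MrFlick suggested, include fixed = TRUE in your gsub call. The problem occurs because "|" is the regular-expression alternation operator, so the pattern "||" is read as a choice between empty alternatives rather than as two literal pipes. Using fixed = TRUE tells gsub to treat the pattern as a literal string and not as a regular expression.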

leadtemp$collate = gsub("||","|",leadtemp$collate, fixed=TRUE)

Another (slightly more complicated) way is to escape each | in the pattern. The replacement is not a regular expression, so the | there does not need escaping:

leadtemp$collate = gsub("\\|\\|","|",leadtemp$collate)
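
As a quick check, here is a minimal sketch that applies both approaches to the string from the question. The variable names (x, expected, fixed_result, escaped_result) are made up for illustration, and a plain character vector stands in for the leadtemp$collate column:

```r
# Sample value copied from the question; a plain vector stands in for the column.
x <- "HYD_SOA_UNBLOCK~SOA_BLOCK-UK|SOA_BLOCK-DE||SOA_BLOCK-FR||SOA_BLOCK-IT||SOA_BLOCK-ES|"
expected <- "HYD_SOA_UNBLOCK~SOA_BLOCK-UK|SOA_BLOCK-DE|SOA_BLOCK-FR|SOA_BLOCK-IT|SOA_BLOCK-ES|"

# Approach 1: treat the pattern as a literal string.
fixed_result <- gsub("||", "|", x, fixed = TRUE)

# Approach 2: keep regex matching, but escape each pipe in the pattern.
escaped_result <- gsub("\\|\\|", "|", x)

identical(fixed_result, expected)    # TRUE
identical(escaped_result, expected)  # TRUE
```

Both calls should give the same result here; fixed = TRUE is probably the easier one to read when no regex features are needed.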
RogB

| is a regular-expression metacharacter. To match it literally, it has to be escaped with a \. \ is itself the escape character in R string literals, so it has to be escaped in the same way. So whenever you want to match a literal | in a pattern, you have to write \\|. The replacement is not a pattern, so a plain | is enough there. This should make your code work:

leadtemp$collate = gsub("\\|\\|","|",leadtemp$collate)
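
To make the double escaping concrete, the sketch below inspects the pattern string itself. The name pattern is only illustrative; the point is that "\\|" in R source code is a two-character string, a backslash followed by a pipe, which the regex engine then reads as a literal pipe:

```r
# "\\|" typed in R source is the two-character string \| .
pattern <- "\\|"

nchar(pattern)          # 2: one backslash and one pipe
writeLines(pattern)     # prints the raw characters seen by the regex engine

# The escaped pattern matches only when a literal pipe is present.
grepl(pattern, "a|b")   # TRUE
grepl(pattern, "ab")    # FALSE
```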
Val

Try:

gsub("[|]{2}", "|", leadtemp$collate)
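
One thing to be aware of: the pattern above matches pipes two at a time, so a run of three or more pipes is shortened but not collapsed to a single pipe. The sketch below shows this on some made-up test strings (runs, pairs_only and any_run are illustrative names), next to a variant with + that collapses a run of any length. Whether that variant is wanted depends on the data, which the question does not specify:

```r
# Hypothetical inputs with runs of 2, 3 and 4 pipes.
runs <- c("a||b", "a|||b", "a||||b")

# Exactly-two matching: each pair becomes one pipe, leftovers stay.
pairs_only <- gsub("[|]{2}", "|", runs)
pairs_only   # "a|b" "a||b" "a||b"

# One-or-more matching: every run collapses to a single pipe.
any_run <- gsub("[|]+", "|", runs)
any_run      # "a|b" "a|b" "a|b"
```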

This defines a character class containing only the pipe character and makes gsub look for exactly two consecutive occurrences. Applied to the string in the question, the result is:

"HYD_SOA_UNBLOCK~SOA_BLOCK-UK|SOA_BLOCK-DE|SOA_BLOCK-FR|SOA_BLOCK-IT|SOA_BLOCK-ES|"
Sun Bee