I'm looking for how to handle strings with both regular and special characters with whitespace in between.

text = 'You got a Check +'

lookup_term = 'check +'

replace_with_term = 'check+'

The final result I'm looking for is 'you got a check+'.

I am currently using text.downcase.gsub(/\b#{lookup_term}\b/, replace_with_term) to handle a lookup_term made of regular characters only, but I can't figure out how to handle the combination of regular characters + whitespace + a special character.

Wiktor Stribiżew
Hawkins

1 Answer

You may use

text.downcase.gsub(/(?<!\w)#{Regexp.escape(lookup_term)}(?!\w)/, replace_with_term)

If you do not want to lowercase the whole string and only need case-insensitive matching, use the /i modifier and remove .downcase:

text.gsub(/(?<!\w)#{Regexp.escape(lookup_term)}(?!\w)/i, replace_with_term)

See the Ruby demo online.

The resulting regex will look like /(?<!\w)check\ \+(?!\w)/, see Rubular demo.
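As a quick check, here is a minimal sketch that runs both variants on the sample input from the question. The variable names `escaped`, `pattern`, `lowercased` and `case_kept` are only illustrative and do not come from the original posts.

```ruby
text = 'You got a Check +'
lookup_term = 'check +'
replace_with_term = 'check+'

# Regexp.escape backslash-escapes the space and the "+",
# so the term is matched literally: check\ \+
escaped = Regexp.escape(lookup_term)
pattern = /(?<!\w)#{escaped}(?!\w)/

# Variant 1: lowercase the whole string, then replace
lowercased = text.downcase.gsub(pattern, replace_with_term)

# Variant 2: keep the original case elsewhere, match case-insensitively
case_kept = text.gsub(/(?<!\w)#{escaped}(?!\w)/i, replace_with_term)

puts lowercased  # you got a check+
puts case_kept   # You got a check+
```

Note that in the second variant only the matched part is replaced, so the leading "You" keeps its capital letter.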

Note that the meaning of \b is context-dependent. It matches an empty position (a location, not a character) in one of three cases:

  1. Before the first character in the string, if the first character is a word character.
  2. After the last character in the string, if the last character is a word character.
  3. Between two characters in the string, where one is a word character and the other is not a word character.
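The three rules above explain why \b fails for a term that ends in a non-word character such as "+". The following sketch illustrates this with hand-written patterns; the variable names and the 'cheque' replacement are made up for the demonstration.

```ruby
text = 'you got a check +'

# A term made of word characters only: \b works as expected
plain = 'you got a check'.gsub(/\bcheck\b/, 'cheque')

# "+" is not a word character and is followed by the end of the string,
# so none of the three rules applies after it: the trailing \b cannot match
with_b = text.gsub(/\bcheck \+\b/, 'check+')

# The same pattern does match when a word character follows the "+",
# which is usually the opposite of what is wanted
glued = 'check +x'.gsub(/\bcheck \+\b/, 'check+')

puts plain   # you got a cheque
puts with_b  # you got a check +   (unchanged)
puts glued   # check+x
```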

The (?<!\w)check\ \+(?!\w) pattern uses unambiguous word boundaries instead. They always match the same way, regardless of the characters at the edges of the term:

  • (?<!\w) - a left-hand word boundary, requires a start of string position or any non-word character immediately to the left of the current location
  • check\ \+ - check, space and +
  • (?!\w) - a right-hand word boundary, requires an end of string position or any non-word character immediately to the right of the current location.
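To see how these lookaround boundaries behave, here is a small sketch that tests the pattern against a few strings. The sample strings are invented for illustration.

```ruby
pattern = /(?<!\w)check\ \+(?!\w)/

# Start and end of string satisfy both lookarounds
at_edges = 'check +'.match?(pattern)

# A space to the left and a comma to the right are non-word characters
in_sentence = 'a check +, ok'.match?(pattern)

# A word character right after "+" violates (?!\w)
word_after = 'check +x'.match?(pattern)

# A word character right before "check" violates (?<!\w)
word_before = 'recheck +'.match?(pattern)

puts at_edges     # true
puts in_sentence  # true
puts word_after   # false
puts word_before  # false
```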
Wiktor Stribiżew
  • Thank you! This does solve the issue and `text.downcase.gsub(/(?<!\w)#{Regexp.escape(lookup_term)}(?!\w)/, replace_with_term)` returns `you got a check+` like I expected. (note: I do want it all lowercase, so using `.downcase` worked for me) – Hawkins Jul 12 '20 at 01:55