I've been looking for a way to do a find and replace on foreign words in BBEdit, but I'm having trouble with it. I came across the question "Regex - What would be regex for matching foreign characters?", which led me to Regular-Expressions.info, where a passage says:

Matching a single grapheme, whether it's encoded as a single code point, or as multiple code points using combining marks, is easy in Perl, PCRE, PHP, Ruby 2.0, and the Just Great Software applications: simply use \X.

When I have a word (yes, this is made up for testing) such as ōallaōallaēēalla, I cannot use [A-Za-z]* to match the entire word; it only matches in segments. The only solution I've been able to come up with is something like ([A-Za-z]*\X{1,10}). Is there an alternative approach that wouldn't be too greedy and would match the entire word instead of pulling it in segments?

DᴀʀᴛʜVᴀᴅᴇʀ

1 Answer

You could use the word boundary \b to match everything between two boundaries. That might not get everything, but for your contrived example it works.

/\b(.+)\b/

If you also want words at the beginning of a line, you need to include those.

/(?:\b|^)(.+)\b/

Try it at regex101.com. I cannot test whether this works in BBEdit, though.
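As a rough way to check the word-boundary idea outside BBEdit, here is a minimal sketch in Python. It uses Python's `re` module, not BBEdit's engine, so behaviour there may differ. The helper name `whole_words`, the sample line, the NFC normalization step, and the letters-only class `[^\W\d_]+` are my own assumptions for illustration, not something taken from the question.

```python
import re
import unicodedata


def whole_words(text):
    """Return each run of letters in text as a single match."""
    # Assumption: normalize to NFC so a base letter plus a combining macron
    # becomes one precomposed letter (Python's \w does not match bare
    # combining marks).
    normalized = unicodedata.normalize("NFC", text)
    # \b anchors at word boundaries; [^\W\d_]+ means "word characters
    # except digits and underscore", which is Unicode-aware for str patterns.
    return re.findall(r"\b[^\W\d_]+\b", normalized)


sample = "foo ōallaōallaēēalla bar"  # hypothetical test line
print(whole_words(sample))

# For comparison: with more than one word on the line, the greedy .+
# runs from the first boundary to the last one.
print(re.findall(r"\b(.+)\b", sample))
```

On a line with several words, the first call should give one match per word, while the greedy `.+` version captures everything from the first word to the last. That suggests restricting the repeated part to word characters if you need one match per word.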

simbabque