I have a regular expression in my PHP script like this:
/(\b$term|$term\b)(?!([^<]+)?>)/iu
This matches the word contained in $term, as long as there's a word boundary before or after it and it's not inside an HTML tag.
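For context, here is a minimal sketch of how a pattern like this might be applied for highlighting. The function name `highlight`, the `<mark>` wrapper, and the `preg_quote` call are assumptions added for illustration and are not part of the original script.

```php
<?php
// Hypothetical helper: wraps each match of $term in <mark> tags,
// skipping occurrences that sit inside an HTML tag.
function highlight(string $term, string $html): string
{
    // Escape regex metacharacters in the term, including the "/" delimiter
    // (an assumption; the original pattern interpolates $term directly).
    $quoted = preg_quote($term, '/');

    // Same shape as the pattern above: a word boundary before or after the
    // term, and no closing ">" ahead without an opening "<" in between.
    $pattern = "/(\\b$quoted|$quoted\\b)(?!([^<]+)?>)/iu";

    // Fall back to the unmodified input if preg_replace reports an error.
    return preg_replace($pattern, '<mark>$1</mark>', $html) ?? $html;
}

echo highlight('cat', '<a href="cat.html">My cat sat</a>'), "\n";
// prints: <a href="cat.html">My <mark>cat</mark> sat</a>
```

With ASCII input the occurrence inside the `href` attribute is left alone and only the one in the link text is wrapped, which is the behaviour I'd like for non-ASCII terms as well.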
However, this doesn't work with non-ASCII text, for example Russian. Is there a way to make it work?
I can get almost as good a result with
/(\s$term|$term\s)(?!([^<]+)?>)/iu
but this is obviously more limited, and since the regexp is used for highlighting search terms, it has the problem of including the space in the highlight.
I've read this Stack Overflow question about the problem, but it doesn't help: the solution there doesn't work correctly. In that example the captures are the other way around (it captures the text outside the search term, whereas I need to capture the search term itself).
Any way to make this work? Thanks!