I have tried \bنيك\b but this does not seem to work.

Examples would be: نيك should MATCH

الهندسة الميكانيكية should NOT MATCH

نيك نيك نيك should MATCH

  • What are you asking? – Jordan Singer Feb 05 '19 at 17:20
  • The word, when used alone, is a bad word. I want to know which regex matches نيك when it stands alone, but not when نيك is joined to other letters before or after it. – Joe Tecson Feb 05 '19 at 17:21
  • look into word boundaries (`\b`) and UTF-8 capabilities (the `re.U` flag) – Aaron Feb 05 '19 at 17:23
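
The suggestion in the last comment can be sketched as follows, assuming Python 3, where `str` patterns are Unicode-aware by default. The helper name `has_standalone` is hypothetical and only for illustration; `re.UNICODE` is passed just to make the intent explicit.

```python
import re

def has_standalone(word, text):
    """Return True if `word` occurs in `text` as a whole word (hypothetical helper)."""
    # \b is a boundary between a word character and a non-word character.
    # With Unicode matching, Arabic letters count as word characters, so the
    # word is rejected when it sits inside a longer Arabic word.
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, re.UNICODE) is not None

print(has_standalone("نيك", "نيك"))                   # True
print(has_standalone("نيك", "الهندسة الميكانيكية"))   # False
print(has_standalone("نيك", "نيك نيك نيك"))           # True
```

If `\b` appears not to work, one likely cause is running under Python 2 without `re.U` and unicode strings, where Arabic letters are not treated as word characters.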

2 Answers

Reasoning about the RegExp pattern using English words was a more straightforward way for me to go about this.

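The pattern wraps a negative lookahead inside a positive lookahead so that the repeated word matches only when no further word character or space follows it; re.match additionally anchors the match at the start of the string, which is what rejects text that comes before the word.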

You can replace hello with نيك in the pattern below to test it against your own words.

import re

pattern = re.compile(r"""
(?=(?![\w\s]+)) # lookahead: the next char must not be a word char or space
(hello[\s]?)+   # so this branch never matches, since 'h' is a word char
|               # or
(hello[\s]?)+   # one or more 'hello', each optionally followed by a space
(?=(?![\w\s]+)) # lookahead: no word char or space may follow the last one
""", re.VERBOSE)

Test cases

>>> case1 = 'hello'
>>> case2 = 'hello world'
>>> case3 = 'world hello'
>>> case4 = 'ehlo world hello'
>>> case5 = 'hello hello hello'

>>> re.match(pattern, case1)
<re.Match object; span=(0, 5), match='hello'>
>>> re.match(pattern, case2)

>>> re.match(pattern, case3)

>>> re.match(pattern, case4)

>>> re.match(pattern, case5)
<re.Match object; span=(0, 17), match='hello hello hello'>
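
To make the substitution concrete, here is a minimal sketch that builds the same kind of check for any word and runs it against the three examples from the question. The helper name `build_pattern` is hypothetical, and the sketch keeps only the alternative that can actually match, so treat it as an illustration rather than the exact pattern above.

```python
import re

def build_pattern(word):
    """Compile a 'repeated word only' pattern for `word` (hypothetical helper)."""
    w = re.escape(word)
    # One or more repetitions of the word, each optionally followed by a
    # whitespace character, with no word character or whitespace after them.
    return re.compile(rf"(?:{w}\s?)+(?![\w\s])")

pattern = build_pattern("نيك")

for text in ["نيك", "الهندسة الميكانيكية", "نيك نيك نيك"]:
    # re.match anchors at the start of the string, as in the tests above.
    print(bool(pattern.match(text)))
```

This prints True, False, True. Because it relies on re.match, it only accepts strings that start with the word; finding the word in the middle of a longer sentence would need a different approach.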
Oluwafemi Sule

If my understanding is right, just use the str.count() method.
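
A short sketch of what that looks like in practice. The function names `count_substring` and `count_whole_words` are hypothetical. Note that str.count() counts substring occurrences, so it also counts the word when it is embedded in a longer word; counting whitespace-separated tokens is one possible way around that, assuming words are separated by whitespace only.

```python
def count_substring(word, text):
    """Count occurrences of `word` anywhere in `text`, including inside longer words."""
    return text.count(word)

def count_whole_words(word, text):
    """Count whitespace-separated tokens of `text` that are exactly `word`."""
    return text.split().count(word)

word = "نيك"
for text in ["نيك", "الهندسة الميكانيكية", "نيك نيك نيك"]:
    print(count_substring(word, text), count_whole_words(word, text))
```

A non-zero whole-word count would correspond to "should MATCH" in the question; punctuation attached to a word is not handled by this sketch.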