I am trying to match text followed by any number using Python's re module, except when the text ends with certain words. For example:

import re

# This pattern is intended to match any alphanumeric word followed by any number,
# unless the alphanumeric word ends with either "germany" or "france".
match = r'[a-zA-Z0-9]{1,}[\s]{1,}(?<!france)(?<!germany)[0-9]{1,}'

re.findall(match, 'alphanumerical1234text  12312442')
>>> ['alphanumerical1234text  12312442']  # this is correct

re.findall(match, 'alphanumerical1234text germany 12312442')
>>> ['germany 12312442']  # this shouldn't return anything

re.findall(match, 'alphanumerical1234textgermany 12312442')
>>> ['alphanumerical1234textgermany 12312442']  # this shouldn't return anything

re.findall(match, 'alphanumerical1234text france 12312442')
>>> ['france 12312442']  # this shouldn't return anything

re.findall(match, 'alphanumerical1234textfrance 12312442')
>>> ['alphanumerical1234textfrance 12312442']  # this shouldn't return anything

Any idea how to build this regular expression?

Cobry

1 Answer

You have to put the lookbehinds before the whitespace. …\s(?<!france) is equivalent to …\s, because anything that ends with a whitespace character cannot also end with an “e” (and, likewise, cannot end with the “y” of germany).

r'[a-zA-Z0-9]+(?<!france)(?<!germany)\s+[0-9]+'
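
As a quick check, here is a minimal sketch that runs the pattern above against the five inputs from the question. The names `pattern`, `samples`, and `results` are only illustrative; they are not part of the original post.

```python
import re

# Pattern from this answer: both lookbehinds sit directly after the word,
# before the whitespace, so they inspect the end of the word itself.
pattern = re.compile(r'[a-zA-Z0-9]+(?<!france)(?<!germany)\s+[0-9]+')

# The five test strings from the question.
samples = [
    'alphanumerical1234text  12312442',
    'alphanumerical1234text germany 12312442',
    'alphanumerical1234textgermany 12312442',
    'alphanumerical1234text france 12312442',
    'alphanumerical1234textfrance 12312442',
]

results = {s: pattern.findall(s) for s in samples}
for s, found in results.items():
    print(repr(s), '->', found)
```

Only the first string should produce a match. The other four come back empty because a lookbehind inspects the text immediately before the current position, regardless of where the match attempt started, so starting the match partway through "germany" or "france" does not get around it.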
Ry-
  • @Cobry: Sorry, fixed the `|`. Moving the assertions behind the space doesn’t affect what’s captured, though. – Ry- Feb 10 '18 at 02:09
  • How would it work if I want the lookbehind to skip any whitespace? r'[a-zA-Z0-9]+\s+(?<!france)(?<!germany)\s+[0-9]+' – Cobry Feb 10 '18 at 02:18
  • @Cobry: Why do you want to do that? Does this solution not work? – Ry- Feb 10 '18 at 02:22
  • It works, but the actual implementation is a little more complex than that. The code I am building must build regular expressions on demand to match addresses with complex zip codes and states matching those zip codes ... – Cobry Feb 10 '18 at 02:26
  • @Cobry: So where does the problem come in? Maybe you can edit that into your question. – Ry- Feb 10 '18 at 02:29
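
Regarding building the expression on demand, as mentioned in the comments: one possible sketch, assuming the excluded words arrive as a plain list of strings, is to generate one negative lookbehind per word. `build_pattern` is a hypothetical helper, not something from the original post. Python's `re` module requires each lookbehind to be fixed-width, which is why every word gets its own assertion rather than one alternation of words with different lengths.

```python
import re

def build_pattern(excluded_words):
    """Hypothetical helper: compile the answer's pattern for any list of excluded words."""
    # One fixed-width negative lookbehind per word; re.escape guards against
    # words that happen to contain regex metacharacters.
    lookbehinds = ''.join('(?<!{})'.format(re.escape(w)) for w in excluded_words)
    # Same shape as the answer's pattern: word, lookbehinds, whitespace, number.
    return re.compile(r'[a-zA-Z0-9]+' + lookbehinds + r'\s+[0-9]+')

pattern = build_pattern(['france', 'germany'])
print(pattern.findall('alphanumerical1234text  12312442'))
print(pattern.findall('alphanumerical1234textgermany 12312442'))
```

With an empty list the helper degenerates to the unrestricted "word, whitespace, number" pattern, so the exclusions can be switched on or off per call.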