regex stop searching after '%' is found

Question

import re
x=r'Biblioteca_Nacional_de_Espa%C3%B1a'
y=re.compile('[A-Za-z_](?!%)')
for i in y.findall(x):
    print(i,end='')

This is an example. I want the search to stop as soon as it finds `%` and print the words before it, separated by spaces. In this example it should print Biblioteca Nacional de Espa. I found this link, "Regex stop searching at specific string", but it was too complicated. Any help is appreciated.

Possible duplicate of [Regex excluding specific characters](https://stackoverflow.com/questions/23352038/regex-excluding-specific-characters) — CertainPerformance, Nov 25 '18 at 08:38
Why not just clip the part from the first `%`, and only then get the words? `x.split("%", 1)[0].split()` — trincot, Nov 25 '18 at 08:41

score 1 · Accepted Answer · answered Nov 25 '18 at 09:57

Your regex [A-Za-z_](?!%) matches a single character from your character set that is not immediately followed by %. So the only character it skips is the a just before the first % (the 3 before the second % is not in the character set anyway), and it still prints every other matching character, including the ones after the %. I don't think that is what you want, since your intended output is Biblioteca Nacional de Espa
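
To see what the negative lookahead actually does, here is a small sketch that runs the question's pattern unchanged on the question's input. The names pattern and result are only illustrative.

```python
import re

x = r'Biblioteca_Nacional_de_Espa%C3%B1a'

# Negative lookahead: match one allowed character unless the very next
# character is '%'. It does not stop the search at '%'.
pattern = re.compile(r'[A-Za-z_](?!%)')

result = ''.join(pattern.findall(x))
print(result)  # Biblioteca_Nacional_de_EspCBa
```

Only the a right before the first % is dropped; C, B and the final a after the % are still matched.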

You can use this regex,

(?<!%)([a-zA-Z]+)(?=.*%)

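and find all matches. The lookahead (?=.*%) requires that a % still appears somewhere after the matched letters, and the lookbehind (?<!%) rejects a run of letters that starts immediately after a %. Here is some sample Python code,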

import re
x=r'Biblioteca_Nacional_de_Espa%C3%B1a'
y=re.compile('(?<!%)([a-zA-Z]+)(?=.*%)')
tokens = y.findall(x)
print(' '.join(tokens))

It prints,

Biblioteca Nacional de Espa

In case you had a typo in your post and actually wanted to capture Biblioteca_Nacional_de_Espa, you just have to keep the underscore (which I removed) in your character set, and the regex becomes,

(?<!%)([a-zA-Z_]+)(?=.*%)

And your Python code becomes,

import re
x=r'Biblioteca_Nacional_de_Espa%C3%B1a'
y=re.compile('(?<!%)([a-zA-Z_]+)(?=.*%)')
tokens = y.findall(x)
print(' '.join(tokens))

which outputs,

Biblioteca_Nacional_de_Espa
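
As an alternative, here is a minimal sketch along the lines of trincot's comment: cut the string at the first % and only then collect the words. The helper name words_before_percent is hypothetical, and it assumes words are runs of ASCII letters separated by underscores. It may also be more robust on other inputs, because the lookaround regex above can still pick up letters between two % signs when more than one letter follows the first %.

```python
import re

def words_before_percent(text):
    # Keep only the part before the first '%'.
    # If there is no '%', the whole string is kept.
    head = text.split('%', 1)[0]
    # Collect runs of ASCII letters; underscores act as separators.
    return re.findall(r'[A-Za-z]+', head)

x = r'Biblioteca_Nacional_de_Espa%C3%B1a'
print(' '.join(words_before_percent(x)))  # Biblioteca Nacional de Espa

# Made-up input with two letters between the '%' signs:
lookaround = re.compile(r'(?<!%)([a-zA-Z]+)(?=.*%)')
print(lookaround.findall('a%BC%d'))    # ['a', 'C']
print(words_before_percent('a%BC%d'))  # ['a']
```

To keep the underscores instead, returning head itself would give Biblioteca_Nacional_de_Espa.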
