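Your regex [A-Za-z_](?!%)
matches a single character from your character set that is not immediately followed by %.
Because of that, it skips only the a that sits directly before the first % and still matches every other character in the set, including the C, B and a in the percent-encoded tail.
(The 3 also sits right before a %, but digits are not in your character set, so it never matched in the first place.)
I don't think that is what you want, since your intended output is Biblioteca Nacional de Espa.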
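To see what the original pattern actually picks up, here is a quick check with re.findall on the same sample string (the variable names are only for illustration):

```python
import re

x = r'Biblioteca_Nacional_de_Espa%C3%B1a'

# The original pattern: one character from the set, not directly followed by %
chars = re.findall(r'[A-Za-z_](?!%)', x)
result = ''.join(chars)
print(result)  # Biblioteca_Nacional_de_EspCBa
```

Only the a right before the first % is dropped. The C, B and final a from the encoded tail still match, which is why the result contains more than you wanted.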
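You can use this regex instead,
(?<!%)([a-zA-Z]+)(?=.*%)
and collect all matches with findall. The negative lookbehind (?<!%) stops a match from starting right after a %, and the lookahead (?=.*%) requires a % somewhere later in the string, which is what drops the trailing a. Here is some sample Python code,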
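import re
x = r'Biblioteca_Nacional_de_Espa%C3%B1a'
# Runs of letters that do not start right after % and have a % somewhere ahead
y = re.compile(r'(?<!%)([a-zA-Z]+)(?=.*%)')
tokens = y.findall(x)  # one capture group, so findall returns its text
print(' '.join(tokens))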
It prints,
Biblioteca Nacional de Espa
If the spaces in your post were a typo and you actually wanted to capture Biblioteca_Nacional_de_Espa, then you just have to keep the underscore (which I removed) in your character set, and the regex becomes,
(?<!%)([a-zA-Z_]+)(?=.*%)
And your Python code becomes,
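import re
x = r'Biblioteca_Nacional_de_Espa%C3%B1a'
# Same pattern, but with _ kept in the character set
y = re.compile(r'(?<!%)([a-zA-Z_]+)(?=.*%)')
tokens = y.findall(x)  # now a single token, underscores included
print(' '.join(tokens))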
which outputs,
Biblioteca_Nacional_de_Espa
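If you need both variants, they can be wrapped in one small helper. This is only a sketch: the function name extract_before_escapes and the keep_underscores flag are my own and do not come from your post.

```python
import re

def extract_before_escapes(text, keep_underscores=False):
    """Return the letter runs that precede the percent-escapes in text."""
    # Hypothetical helper: the character class is the only difference
    # between the two regexes shown above.
    char_class = '[a-zA-Z_]' if keep_underscores else '[a-zA-Z]'
    pattern = re.compile(r'(?<!%)(' + char_class + r'+)(?=.*%)')
    return pattern.findall(text)

x = r'Biblioteca_Nacional_de_Espa%C3%B1a'
print(' '.join(extract_before_escapes(x)))
print(' '.join(extract_before_escapes(x, keep_underscores=True)))

# Caveat: the lookbehind only blocks the character right after %, so an
# escape whose second hex digit is a letter (e.g. %CA) leaks that letter.
print(extract_before_escapes('Espa%CA%B1a'))
```

For the sample string in your question this behaves exactly like the two snippets above. The last call shows the one limitation I am aware of: if your real input can contain escapes such as %CA, the pattern would need tightening.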