Something strange happens when I run a regex search on the text returned by pyperclip.paste() and the search expression involves a \n newline character.
Excuse my English.
When I run the search on a triple-quoted string assigned to a text variable:
import re
text = '''
This as the line 1
This as the line 2
'''
pattern = re.compile(r'\d\n\w+')
result = pattern.findall(text)
print(result)
It actually matches across the line break and prints the newline character \n, which is what I want, or at least what I expect:
»»» ['1\nThis']
But the problem starts when the string to search comes from text copied from the clipboard:
This as the line 1
This as the line 2
Say I just select that text, copy it to the clipboard, and want the regex to extract the same output as before. This time I need the pyperclip module.
So, forgetting the previous code, I write this instead:
import re, pyperclip
text = pyperclip.paste()
pattern = re.compile(r'\d\n\w+')
result = pattern.findall(text)
print(result)
This is the result:
»»» [ ]
Nothing but two brackets. I discovered (inexperienced as I am) that the \n character in the pattern is what causes this. It has nothing to do with Python interpreting \n as an escape in the string literal, because the r prefix avoids that.
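A quick way to test that suspicion is to look at the raw characters with repr(). The sketch below does not touch the clipboard: it uses a hard-coded stand-in string and assumes the pasted text has Windows-style \r\n line endings, which is an assumption about the clipboard content rather than something pyperclip guarantees.

```python
import re

# Stand-in for pyperclip.paste(); the "\r\n" line endings are an assumption.
text = "This as the line 1\r\nThis as the line 2\r\n"

# repr() shows escape sequences, so any hidden \r becomes visible.
print(repr(text))

# With "\r\n" endings the digit is followed by \r, not \n, so this finds nothing.
with_lf = re.findall(r'\d\n\w+', text)

# Spelling out both characters lets the same idea match.
with_crlf = re.findall(r'\d\r\n\w+', text)

print(with_lf)
print(with_crlf)
```

If repr() of the real clipboard text shows \r\n between the lines, that would explain the empty list.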
I already found a workaround, although it is not very clear (to me at least, because I only know the basics of Python right now).
import re, pyperclip
text = pyperclip.paste()
lines = text.split('\n')
spam = ''
for i in lines:
    spam = spam + i
pattern = re.compile(r'\d\r\w+')
result = pattern.findall(spam)
print(result)
Note that, to detect the line break in this last regex, I used \r instead of \n (\n would cause the same bad behavior, printing only brackets). \r is interchangeable with \s here. The search works, but the output is:
»»» ['1\rThis']
That is, with \r instead of \n. At least it was a little victory for me.
It would help me a lot if you could explain a better solution, or at least help me understand why this happens. You could also recommend some concepts for me to look into so I can fully understand this.
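One possible direction, as a sketch only: if the pasted text really does use \r\n line endings (an assumption, simulated here with a hard-coded string instead of pyperclip.paste()), the pattern can either tolerate an optional \r, or the line endings can be normalized before searching.

```python
import re

# Stand-in for pyperclip.paste(); the "\r\n" line endings are an assumption.
text = "This as the line 1\r\nThis as the line 2\r\n"

# Option A: accept an optional \r before the \n in the pattern itself.
tolerant = re.findall(r'\d\r?\n\w+', text)

# Option B: convert "\r\n" to "\n" first, then reuse the original pattern.
normalized = text.replace('\r\n', '\n')
original = re.findall(r'\d\n\w+', normalized)

print(tolerant)
print(original)
```

Option A keeps the \r in the matched text, while Option B gives the same output as the triple-quoted example, so which one fits depends on whether the \r should survive in the results.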