I am creating a Jupyter notebook to clean a large number of novels, using regular expressions that I first test in Sublime Text. Many of my texts contain the phrase 'Digitized by Google', because I got the PDFs from Google before running them through Optical Character Recognition (OCR). I want to remove every line that contains 'Digitized', or rather 'igitized', since the first part isn't always transcribed correctly.
When I use this pattern in Sublime's replace function, I get exactly the results I want:
^.*igitized.*$
However, when I use the same pattern with re.sub in my Jupyter notebook, an approach that works for some other phrases, the 'Digitized by Google' lines are not matched, so they are not replaced with an empty string.
text = re.sub(r'^.*igitized.*$', '', text)
What am I missing?
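One likely cause, sketched under the assumption that text holds a whole novel as a single multi-line string: Sublime treats ^ and $ as line anchors, while Python's re module by default anchors them only at the start and end of the entire string, unless re.MULTILINE is passed. The sample string below is hypothetical and only stands in for the real OCR output, which is not shown here.

```python
import re

# Hypothetical sample standing in for OCR output; not taken from the real texts.
sample = (
    "Chapter One\n"
    "Digitized by Google\n"
    "It was a dark night.\n"
    "Bigitized by Google\n"  # first letter mis-transcribed by OCR
    "The end."
)

pattern = r'^.*igitized.*$'

# Default flags: ^ matches only at the very start of the string, and '.'
# does not cross newlines, so nothing is matched or replaced.
without_flag = re.sub(pattern, '', sample)

# re.MULTILINE lets ^ and $ match at every line boundary, so each offending
# line is emptied (the newline itself stays, leaving a blank line).
with_flag = re.sub(pattern, '', sample, flags=re.MULTILINE)

# Optional variant: also consume the trailing newline so no blank line remains.
without_blank_lines = re.sub(r'^.*igitized.*\n?', '', sample, flags=re.MULTILINE)

print(without_blank_lines)
```

If this matches the situation in the notebook, passing flags=re.MULTILINE to the existing re.sub call may be the only change needed. The variant that also consumes the newline is a design choice: it avoids leaving empty lines where the 'Digitized by Google' lines used to be.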