We just learned about regular expressions in my first Python course (I'm extremely new to programming). One of the homework problems I'm struggling with asks us to use a regular expression to find all the words of length n or longer, and then use that regular expression to find the longest word in a text file.
I have no problem when I want to test out a specific length, but it returns an empty list when I use an arbitrary variable n:
import re
with open('shakespeare.txt') as file:
    shakespeare = file.read()
n = 10 #if I take this out and put an actual number in the curly bracket below, it works just fine.
words = re.findall('^[A-Za-z\'\-]{n,}', shakespeare, re.M)
print(words)
len(words)
I'm not sure what I did wrong and how to resolve this. Any help is greatly appreciated!
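One likely explanation, sketched under assumptions: inside an ordinary string literal, `{n,}` is just the characters `{`, `n`, `,`, `}`, so the regex never sees the value of the variable n, and Python's `re` treats the malformed quantifier as literal text. Building the pattern with an f-string (doubling the braces that should stay literal) substitutes the number. The sample text below is a hypothetical stand-in for shakespeare.txt, and the sketch drops the `^` anchor on the assumption that words anywhere on a line should count, not only those at the start of a line.

```python
import re

# Hypothetical sample text standing in for shakespeare.txt.
text = (
    "tragical-comical-historical-pastoral plays\n"
    "are honorificabilitudinitatibus\n"
    "short words"
)

n = 10

# Plain string: the regex engine receives the literal characters "{n,}",
# which is not a valid quantifier, so it is matched as literal text.
literal_pattern = r"[A-Za-z'-]{n,}"
print(re.findall(literal_pattern, text))

# f-string: {n} is replaced by the value of n, while {{ and }} produce
# the literal braces that the regex quantifier needs.
pattern = rf"[A-Za-z'-]{{{n},}}"
print(pattern)

words = re.findall(pattern, text)
print(words)
print(len(words))
```

Placing the hyphen last in the character class and using a raw string avoids the need for the `\'` and `\-` escapes.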
For more context... To find the longest word, I used:
#for words with special characters such as '-' and "'"
longest_word = max(re.findall('\S+', shakespeare, re.M), key = len)
#for words without special characters:
longest_pure_word = max(re.findall('[A-Za-z]+ ', shakespeare, re.M), key = len)
output1(special char): tragical-comical-historical-pastoral
output2(pure word): honorificabilitudinitatibus
I didn't use n because I couldn't get the first part of the question to work.
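For what it's worth, here is a rough sketch of how the two parts might fit together once n is substituted into the pattern. The function name `longest_word`, its `letters_only` flag, and the sample text are all hypothetical, made up for illustration; this is not necessarily what the assignment expects.

```python
import re

def longest_word(text, n, letters_only=False):
    """Return the longest word of at least n characters, or None if there is none."""
    # Assumption: a "word" is a run of letters, optionally with ' and - allowed.
    char_class = "[A-Za-z]" if letters_only else "[A-Za-z'-]"
    # {{ and }} become literal braces; {n} becomes the number.
    pattern = rf"{char_class}{{{n},}}"
    candidates = re.findall(pattern, text)
    # default=None avoids a ValueError when no word is long enough.
    return max(candidates, key=len, default=None)

# Hypothetical sample text standing in for shakespeare.txt.
sample = (
    "tragical-comical-historical-pastoral plays\n"
    "are honorificabilitudinitatibus\n"
    "short words"
)

print(longest_word(sample, 10))
print(longest_word(sample, 10, letters_only=True))
print(longest_word(sample, 50))
```

Filtering by length first and then taking `max(..., key=len)` gives the same longest word as searching without n, as long as at least one word meets the minimum length.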