Please, please, please help. I have a folder full of text files that I want to analyze with NLTK. How do I import the folder as a corpus and then run NLTK commands on it? I've put together the code below, but it gives me this error:
raise error, v # invalid expression
sre_constants.error: nothing to repeat
Code:
import nltk
import re
from nltk.corpus.reader.plaintext import PlaintextCorpusReader
corpus_root = '/Users/jt/Documents/Python/CRspeeches'
speeches = PlaintextCorpusReader(corpus_root, '*.txt')
print "Finished importing corpus"
words = FreqDist()
for sentence in speeches.sents():
    for word in sentence:
        words.inc(word.lower())
print words["he"]
print words.freq("he")
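
A likely cause, going by the error text: `PlaintextCorpusReader` takes its file pattern as a regular expression, not a shell glob, so `'*.txt'` begins with a `*` that has nothing to repeat. Here is a minimal sketch using only the standard `re` module that shows the difference. It is written in Python 3 syntax, and the file names are made up for illustration; they are not from the folder above.

```python
import re

glob_style = '*.txt'       # shell-glob syntax: '*' has nothing before it to repeat
regex_style = r'.*\.txt'   # regex equivalent: any characters, then a literal ".txt"

# Compiling the glob-style string as a regex fails with "nothing to repeat".
glob_error = None
try:
    re.compile(glob_style)
except re.error as exc:
    glob_error = exc
    print("glob-style pattern rejected: %s" % exc)

# Hypothetical file names, used only to show what the regex form selects.
candidates = ['speech1.txt', 'notes.md', 'speech2.txt']
matched = [name for name in candidates if re.fullmatch(regex_style, name)]
print(matched)
```

If that is the problem, passing `r'.*\.txt'` as the second argument to `PlaintextCorpusReader` should get past this error. Two other things to check, depending on the NLTK version installed: `FreqDist` is never imported in the code above (`from nltk import FreqDist`), and NLTK 3 dropped `FreqDist.inc`, so there the count is updated with `words[word.lower()] += 1`.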