I want to save word frequency lists as CSV files for several corpora. Is there a way to have Python generate the filenames automatically from the variable name (e.g. corpus_a -> corpus_a_typefrequency.csv)?
I have the following code, which already works for individual corpora:
from collections import Counter
import csv
counts = Counter(corpus_a)
counts = dict(sorted(counts.items(), key=lambda item: item[1],reverse=True))
with open('corpus_a_typefrequency.csv', 'w') as csv_file:
    writer = csv.writer(csv_file)
    for key, value in counts.items():
        writer.writerow([key, value])
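As far as I understand, a Python object does not know the name of the variable it is bound to, so one direction I am considering is to keep the corpora in a dict keyed by name and build each filename from the key. The following is only a rough sketch of that idea: the function name `save_type_frequencies`, the `out_dir` parameter, and the toy token lists are made up for illustration, and `newline=''` and `encoding='utf-8'` are additions to my original code.

```python
import csv
import tempfile
from collections import Counter
from pathlib import Path


def save_type_frequencies(corpora, out_dir="."):
    """Write one '<name>_typefrequency.csv' per entry in `corpora` (hypothetical helper)."""
    written = []
    for name, tokens in corpora.items():
        # The dict key plays the role of the variable name
        path = Path(out_dir) / f"{name}_typefrequency.csv"
        # newline='' keeps the csv module from writing blank rows on Windows
        with open(path, "w", newline="", encoding="utf-8") as csv_file:
            writer = csv.writer(csv_file)
            # most_common() yields (word, count) pairs, highest count first
            writer.writerows(Counter(tokens).most_common())
        written.append(path)
    return written


# Toy token lists stand in for real corpora such as brown.words()
corpora = {
    "corpus_a": ["the", "cat", "the", "dog"],
    "corpus_b": ["a", "b", "a"],
}
# Written to a temporary directory here so the sketch leaves no files behind
paths = save_type_frequencies(corpora, out_dir=tempfile.mkdtemp())
```

With this layout, adding another corpus would only mean adding another key to the dict, for example "corpus_c".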
PS: It would be great if I could count only words (no punctuation), and do so case-insensitively. I haven't figured out how to do that yet. I'm using data from the Brown Corpus as follows:
import nltk
from nltk.corpus import brown
corpus_a = brown.words()
I tried brown.words().lower().isalpha(), but that doesn't work.
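My guess is that it fails because brown.words() returns a list-like sequence of strings, while lower() and isalpha() are string methods, so they would have to be applied to each token rather than to the whole sequence. A minimal sketch of what I think I need is below; the helper name `count_words` and the sample tokens are made up for illustration and stand in for the real corpus.

```python
from collections import Counter


def count_words(tokens):
    """Count purely alphabetic tokens, ignoring case (hypothetical helper)."""
    # isalpha() drops punctuation and numbers; lower() merges 'The' and 'the'
    return Counter(tok.lower() for tok in tokens if tok.isalpha())


# Sample tokens standing in for brown.words()
tokens = ["The", "the", ",", "Dog", "dog", ".", "1961", "well-known"]
counts = count_words(tokens)
```

One thing I am unsure about is that isalpha() also rejects hyphenated tokens such as "well-known", which may or may not be what I want.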