I want to lemmatize a corpus of 24 .txt files (dir1) and save them in dir2. I want to write the output of print(word.lemma_, end=" ") to dir2. This is the code I have so far:
import os
import re
import nltk
import spacy
import sys
sp = spacy.load('en_core_web_sm')
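dir1 = "corpus"
dir2 = "corpus2"
for txt in os.listdir(dir1):
    # Open each corpus file; the with-block closes it automatically
    with open(os.path.join(dir1, txt), "r", encoding="utf-8") as file:
        for line in file:
            sentence = sp(line)
            for word in sentence:
                # Currently the lemmas only go to the console, not to dir2
                print(word.lemma_, end=" ")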
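For reference, here is a minimal sketch of what I am trying to end up with: one output file in dir2 per input file in dir1, under the same file name. The helper name lemmatize_corpus and its parameters are hypothetical, not part of spaCy. It assumes that nlp is any callable returning tokens with a lemma_ attribute (such as the sp pipeline loaded above), and that keeping one output line per input line is acceptable.

```python
import os


def lemmatize_corpus(nlp, src_dir, dst_dir):
    """Write a lemmatized copy of every .txt file in src_dir to dst_dir.

    nlp is assumed to be a callable that takes a string and returns an
    iterable of tokens exposing `lemma_` (e.g. a loaded spaCy pipeline).
    """
    # Create the output directory if it does not exist yet
    os.makedirs(dst_dir, exist_ok=True)

    for name in sorted(os.listdir(src_dir)):
        # Skip anything that is not a .txt file
        if not name.endswith(".txt"):
            continue

        src_path = os.path.join(src_dir, name)
        dst_path = os.path.join(dst_dir, name)

        with open(src_path, "r", encoding="utf-8") as src, \
             open(dst_path, "w", encoding="utf-8") as dst:
            for line in src:
                # Drop the trailing newline so it is not treated as a token
                doc = nlp(line.rstrip("\n"))
                # Join the lemmas with spaces, one output line per input line
                dst.write(" ".join(token.lemma_ for token in doc) + "\n")


# Usage with the names from the question (needs spaCy and en_core_web_sm):
#   import spacy
#   sp = spacy.load("en_core_web_sm")
#   lemmatize_corpus(sp, "corpus", "corpus2")
```

Writing with dst.write instead of print(..., end=" ") avoids the trailing space after each lemma; print(word.lemma_, end=" ", file=dst) would presumably also work if the exact print output is wanted.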