Instead of defining documents like this ...
documents = ["the mayor of new york was there", "machine learning can be useful sometimes", "new york mayor was present"]
... I want to read the same three sentences from two different txt files, with the first sentence in the first file and sentences 2 and 3 in the second file.
I have come up with this code:
import glob
import os

# read txt documents
os.chdir('text_data')
documents = []
for file in glob.glob("*.txt"):  # read all txt files in working directory
    with open(file, "r") as file_content:  # close the file once it has been read
        lines = file_content.read().splitlines()
    for line in lines:
        documents.append(line)  # one list entry per line of text
But the documents resulting from the two strategies seem to be in a different format. I want the second strategy to produce the same output as the first.
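To make the comparison concrete, here is a minimal, self-contained sketch of the check I would expect to pass. It rests on assumptions that are not part of my actual setup: the file names doc1.txt and doc2.txt are hypothetical, the files are created in a temporary folder, and the helper read_documents is just an illustration. It also guesses at two possible causes of the difference (file order and blank lines), which may not be what is happening in my case.

```python
import glob
import os
import tempfile

# The hard-coded list from the first strategy.
expected = [
    "the mayor of new york was there",
    "machine learning can be useful sometimes",
    "new york mayor was present",
]


def read_documents(folder):
    """Return every non-empty line of every .txt file in folder as one flat list."""
    documents = []
    # sorted() fixes the file order; glob itself does not guarantee any order.
    for path in sorted(glob.glob(os.path.join(folder, "*.txt"))):
        with open(path, "r", encoding="utf-8") as handle:
            for line in handle.read().splitlines():
                line = line.strip()
                if line:  # skip blank lines, e.g. an empty line between sentences
                    documents.append(line)
    return documents


# Build two throwaway files (hypothetical names) holding the three sentences.
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "doc1.txt"), "w", encoding="utf-8") as f:
        f.write(expected[0] + "\n")
    with open(os.path.join(folder, "doc2.txt"), "w", encoding="utf-8") as f:
        f.write(expected[1] + "\n\n" + expected[2] + "\n")
    documents = read_documents(folder)

# Compare the two strategies and show each entry unambiguously.
print(documents == expected)
for doc in documents:
    print(repr(doc))
```

Printing each entry with repr() should expose stray whitespace, empty strings, or an unexpected order, which are the differences I can imagine between the two lists.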