How to load multiple text files from a folder into a Python list variable

Question

I have a folder full of text documents, the text of which needs to be loaded into a single list variable.

Each element of the list should be the full text of one document.

So far I have this code, but it is not working as intended.

dir = os.path.join(current_working_directory, 'FolderName')
file_list = glob.glob(dir + '/*.txt')
corpus = [] #-->my list variable
for file_path in file_list:
    text_file = open(file_path, 'r')
    corpus.append(text_file.readlines()) 
    text_file.close()

Is there a better way to do this?

Edit: Replaced the csv reading function (read_csv) with text reading function (readlines()).

Martin Evans · Accepted Answer · 2021-01-04T20:45:26.847

13

You just need to read() each file in and append it to your corpus list as follows:

import glob
import os

file_list = glob.glob(os.path.join(os.getcwd(), "FolderName", "*.txt"))

corpus = []

for file_path in file_list:
    with open(file_path) as f_input:
        corpus.append(f_input.read())

print(corpus)

Each list entry is then the entire contents of one text file. Note that readlines() would instead give you a list of lines for each file rather than the raw text.
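To make the difference between read() and readlines() concrete, here is a minimal sketch. The temporary folder, the file name sample.txt, its two-line content and the utf-8 encoding are all assumptions made up for the demonstration; they are not part of the question.

```python
import tempfile
from pathlib import Path

# Write a small two-line sample file (hypothetical content) to a temporary folder.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "sample.txt"
    sample_path.write_text("first line\nsecond line\n", encoding="utf-8")

    with open(sample_path, encoding="utf-8") as f_input:
        whole_text = f_input.read()        # one string for the whole file

    with open(sample_path, encoding="utf-8") as f_input:
        line_list = f_input.readlines()    # one string per line, newlines kept

print(repr(whole_text))
print(line_list)
```

Appending whole_text to corpus gives one string per document, which is what the question asks for; appending line_list would give a list of lists.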

With a list comprehension

file_list = glob.glob(os.path.join(os.getcwd(), "FolderName", "*.txt"))

corpus = [open(file).read() for file in file_list]

This approach, though, may use more resources, because there is no with block to close each file automatically.

edited Jan 04 '21 at 20:45

answered Feb 23 '17 at 07:54

If you define a function, e.g. `def get_file_text()`, that uses a with block, you can then use it in your list comprehension so that the files are still closed. – Luke Nelson Nov 29 '21 at 04:12
Indeed, you are correct; I was trying to emphasize why a one-line approach might have disadvantages. – Martin Evans Nov 29 '21 at 16:02
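The suggestion in the comment above could look roughly like this. It is only a sketch: get_file_text() is the name from the comment, while load_corpus(), the sorted() call (glob returns paths in arbitrary order), the utf-8 encoding and the demo files are assumptions added for illustration.

```python
import glob
import os
import tempfile

def get_file_text(file_path):
    """Return the full text of one file; the with block closes it on exit."""
    with open(file_path, encoding="utf-8") as f_input:
        return f_input.read()

def load_corpus(folder):
    """Return a list with one string per .txt file in folder, sorted by path."""
    file_list = sorted(glob.glob(os.path.join(folder, "*.txt")))
    return [get_file_text(file_path) for file_path in file_list]

# Small demonstration with a throwaway folder and made-up file contents.
with tempfile.TemporaryDirectory() as demo_folder:
    for name, content in [("a.txt", "alpha"), ("b.txt", "beta")]:
        with open(os.path.join(demo_folder, name), "w", encoding="utf-8") as f_out:
            f_out.write(content)
    corpus = load_corpus(demo_folder)

print(corpus)
```

This keeps the list comprehension to one line while every file is still closed as soon as it has been read.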

Trenton McKinney · Answer 2 · 2021-11-29T03:52:39.890

Solve this with the pathlib module, which treats paths as objects with methods.
Use Path() to create a pathlib object of the path (or use .cwd()), and use .glob (or .rglob()) to find the files matching the specific pattern.
- files = (Path.cwd() / 'FolderName').glob('*.txt')
  - The / operator appends folders or file names to a pathlib object.
- Alternatives:
  - files = Path('./FolderName').glob('*.txt')
  - files = Path('e:/PythonProjects/stack_overflow/t-files/').glob('*.txt')
Path.read_text() can be used to read the text into a list, without using .open(). The file is opened and then closed.
- text = [f.read_text() for f in files]
- Alternatives:
  - text = [f.open().read() for f in files]
  - text = [f.open().readlines() for f in files] - creates a list of lists of text.

from pathlib import Path

# get the files
files = (Path.cwd() / 'FolderName').glob('*.txt')

# read the text from each file into a list with a list comprehension - each file is opened and closed
text = [f.read_text() for f in files]

`for-loop` Alternative

Option 1

files = Path('./FolderName').glob('*.txt')

text = list()

for file in files:
    text.append(file.read_text())  # the file is opened and closed

Option 2

Path.open() in a with block, together with .read(), opens the file, reads its text into the list, and closes the file.

files = Path('./FolderName').glob('*.txt')

text = list()

for file in files:
    with file.open() as f:
        text.append(f.read())
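One thing to keep in mind with the pathlib versions is that .glob() returns a generator, so files can only be looped over once, and the order of the results is not guaranteed. A possible way around both, sketched here with made-up file names and an assumed utf-8 encoding, is to wrap the call in sorted():

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    folder = Path(tmp_dir)
    # Made-up sample files, including one that should not match '*.txt'.
    (folder / "doc1.txt").write_text("text one", encoding="utf-8")
    (folder / "doc2.txt").write_text("text two", encoding="utf-8")
    (folder / "notes.md").write_text("ignored", encoding="utf-8")

    # sorted() turns the generator into a list in a fixed order,
    # so it can be reused for both the names and the texts.
    files = sorted(folder.glob("*.txt"))

    names = [f.name for f in files]
    text = [f.read_text(encoding="utf-8") for f in files]

print(names)
print(text)
```

Keeping names next to text makes it possible to tell which document each list entry came from.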

Also see SO: How to open every file in a folder

score -1 · Answer 3 · answered Jun 03 '20 at 12:23

I find this to be an easier way:

    import glob

    corpus = []

    file_list = glob.glob("Foldername/*.txt")
    for file_path in file_list:
        with open(file_path, 'r') as file_input:
            corpus.append(file_input.read())
    print(corpus)

score -3 · Answer 4 · edited Dec 17 '18 at 08:35

import csv

csv_file = "card.csv"

with open(csv_file, 'r') as f:
    reader = csv.reader(f)
    for i, row in enumerate(reader):
        if i == 0:
            print(i)
            continue    # Skip header row

        # Each data row is expected to hold exactly six fields
        filename, filepath, x, y, w, h = row

        # Append the last four fields, one per line, to <filename>.txt
        with open(filename + ".txt", "a") as file1:
            file1.write("%s\n%s\n%s\n%s\n" % (x, y, w, h))

edited Dec 17 '18 at 08:35

Nam G VU

answered Dec 17 '18 at 08:04

arslan

Add some explanation to your answer – executable Dec 17 '18 at 08:07
