
This is my current code.

my_file = open("/content/txts/txt1.txt", "r")
data = my_file.read()
l1 = clean_data(data)

my_file = open("/content/txts/txt2.txt", "r") 
data = my_file.read() 
l2 = clean_data(data)

my_file = open("/content/txts/txt3.txt", "r") 
data = my_file.read() 
l3 = clean_data(data)

my_file = open("/content/txts/txt4.txt", "r") 
data = my_file.read() 
l4 = clean_data(data) 

But I don't want to apply the same functions over and over again. To create separate lists for each of my txt files, I have tried an alternative:

import os
pathToFolder = '/content/txts'
fileList = os.listdir(pathToFolder)
dataDict = {}
for i in range(len(fileList)-1):
    with open(fileList[i], "r") as f:
        data = f.read()
        dataDict['l' + str(i)] = clean_data(data)
        f.close()

But I am getting an error.

This is my txts folder:

1 Answer


Try using os.listdir() to list the files in the folder, then loop through that list. You would need to save which file you left off on so you can start in the same place next time. The following code loops from whatever index you left off at to the end of the folder and saves all the cleaned data into a dictionary, with keys named similarly to the ones in your question.

import os

pathToFolder = '/content/txts'
fileList = os.listdir(pathToFolder)
dataDict = {}
for i in range(WhereYouLeftOffAt, len(fileList)):
    # build the full path so open() can find the file; "with" closes it for you
    with open(pathToFolder + '/' + fileList[i], "r") as f:
        data = f.read()
    dataDict['l' + str(i)] = clean_data(data)
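For example, with WhereYouLeftOffAt = 0 the loop covers the whole folder: dataDict['l0'] then holds the cleaned text of the first file os.listdir() returns, dataDict['l1'] the second, and so on. Note that os.listdir() does not guarantee any particular order, so sorting fileList first keeps those keys tied to the same files between runs.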
– Mitchell Leefers
    Or keep a set of files processed and pickle it so that each time the program is run it can read the pickle and exclude those files from further processing. – wwii Feb 08 '23 at 19:52
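A minimal sketch of that pickle idea, assuming clean_data from the question is available and using a hypothetical processed.pkl file in the working directory to remember which files have already been cleaned:

import os
import pickle

pathToFolder = '/content/txts'
stateFile = 'processed.pkl'            # hypothetical location for the pickled set

# load the set of filenames handled on earlier runs, if any
if os.path.exists(stateFile):
    with open(stateFile, 'rb') as f:
        processed = pickle.load(f)
else:
    processed = set()

dataDict = {}
for i, name in enumerate(sorted(os.listdir(pathToFolder))):
    if name in processed:
        continue                       # skip files cleaned on a previous run
    with open(pathToFolder + '/' + name, "r") as f:
        dataDict['l' + str(i)] = clean_data(f.read())
    processed.add(name)

# persist the updated set for the next run
with open(stateFile, 'wb') as f:
    pickle.dump(processed, f)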