
I am dealing with a large data set in which each record needs to be written to one of many files. I am trying to open all of the files at once so I can write to the appropriate one as I go through the data (I am working with Python 3.7).

I could use multiple with open(...) as ... statements, but I was wondering whether there is a way to do this without having to write out an open statement for each file.

I was thinking about using a for loop to open the files, but I have heard this is not exception safe and is bad practice.

So what do you think is the best way to open multiple files whose filenames are stored in a list?

Kyung Lee
  • I would always use a for loop as you've said, not sure why it would be bad practice. – Sirsmorgasboard Feb 11 '19 at 19:21
  • "I was thinking about using a for loop to open the files but heard this is not exception safe and is bad practice." You heard incorrectly. – juanpa.arrivillaga Feb 11 '19 at 19:28
  • I heard it is better practice to use the context manager to open files because if you use open() and an exception is raised before the file is properly closed, it will lead to unclosed files. – Kyung Lee Feb 11 '19 at 19:28
  • @KyungLee you can use a context-manager in your for-loop. – juanpa.arrivillaga Feb 11 '19 at 19:28
  • See this: https://stackoverflow.com/questions/3024925/create-a-with-block-on-several-context-managers – juanpa.arrivillaga Feb 11 '19 at 19:31
  • Anyway, is this a process that is supposed to run indefinitely? Honestly, a resource leak won't be a huge deal if this is some script that runs a batch process and some may fail. The resources will be reclaimed by your OS once the process finishes. But regardless, you can still do a dynamic number of context-managers using the techniques in that link – juanpa.arrivillaga Feb 11 '19 at 19:33
  • Thanks for the help. Honestly I don't think it would matter for the script I am writing now but I was just curious as I am just starting out with Python :) – Kyung Lee Feb 11 '19 at 19:36
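The technique in the question linked above boils down to contextlib.ExitStack, which lets you enter a dynamic number of context managers and still get every file closed on an exception. A minimal sketch (the filenames here are made up for illustration):

```python
from contextlib import ExitStack

filenames = ['a.txt', 'b.txt', 'c.txt']  # hypothetical list of output files

with ExitStack() as stack:
    # enter_context() registers each file so the stack closes it on exit
    files = [stack.enter_context(open(name, 'w')) for name in filenames]
    # write each record to whichever file it belongs in
    files[0].write('first record\n')
    files[2].write('third record\n')
# all files are closed here, even if an exception was raised above
```

This keeps the exception safety of a with block while still opening the files in a loop.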

1 Answer


I usually use glob and a dict to do this. This assumes your data is in .csv format, but the format shouldn't really matter to the idea:

Use glob to gather all your files into a list. Say they are in a folder called Data inside your main folder:

    import glob
    import pandas as pd

    data = glob.glob('Data/*.csv')  # put every .csv file into a list
    # you can change .csv to whatever you need
    dict_data = {}  # create an empty dictionary

    for n, i in enumerate(sorted(data)):
        dict_data['file_' + str(n + 1)] = pd.read_csv(i)

Here you can substitute your with...open statement for the pd.read_csv call. In the end you'll get a dict with keys file_1, ..., file_n holding your data. I find it the best way to work with lots of files. You might need to do some tinkering if you're working with more than one type of data, though.
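As a sketch of that with...open substitution, assuming plain text files under a hypothetical Data/ folder (each file is opened, read, and closed inside its own with block, so nothing leaks even on error):

```python
import glob

dict_data = {}
for n, path in enumerate(sorted(glob.glob('Data/*.csv'))):
    # the with block guarantees the file is closed before the next iteration
    with open(path) as f:
        dict_data['file_' + str(n + 1)] = f.read()
```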

Hope it helps

Juan C