
I'm trying to search my computer for a set of CSV files and draw them all on one plot (I'm using Python 2.7 and Pandas).

All of the CSV files are named file.csv, but they are located in many different folders. In the code below, I read each CSV into a dataframe and then plot a certain range of values from it.

I would like to label each line with its folder name (i.e. have the legend show the folder that the CSV is located in).

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import style


class do(object):

    def something(self):

        style.use('ggplot')

        file_1 = r'C:\User\me\PathABC\Folder123\file.csv'
        file_2 = r'C:\User\me\PathABC\Folder456\file.csv'
        file_3 = r'C:\User\me\PathABC\Folder789\file.csv'
        file_4 = r'C:\User\me\PathABC\Folder101112\file.csv'

        # The files have no header row, so columns are addressed as 0 and 1
        df1 = pd.read_csv(file_1, header=None)
        df2 = pd.read_csv(file_2, header=None)
        df3 = pd.read_csv(file_3, header=None)
        df4 = pd.read_csv(file_4, header=None)

        # One line per file, labeled by hand with the folder name
        plt.plot(df1[0], df1[1], label='Folder123')
        plt.plot(df2[0], df2[1], label='Folder456')
        plt.plot(df3[0], df3[1], label='Folder789')
        plt.plot(df4[0], df4[1], label='Folder101112')

        plt.xlim([200000, 800000])

        plt.legend()
        plt.ylabel('Amplitude')
        plt.xlabel('Hz')

        plt.grid(True, color='k')

        plt.show()


x = do()
x.something()

Essentially, I would like to automate this process so that I can search my computer using the following logic:

where file.csv exists, plot it
label the plot with the name of the folder that file.csv came from
Devin Liner

2 Answers


Walking the directory tree is one answer, but you may be able to use glob.glob in simpler cases where the target folders are all at the same depth in the filesystem. For example,

for filename in glob.glob('somewhere/sheets/*/file.csv'):

will iterate over all files called file.csv in any immediate subfolder of somewhere/sheets. If they are all two levels down, glob.glob('somewhere/sheets/*/*/file.csv') will work, and if they are all one or two levels down, you can concatenate the lists from two glob invocations.
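
To tie this back to the question, here is a minimal sketch of how the glob approach might be combined with the legend labels. The function names find_labeled_csvs and plot_labeled_csvs are hypothetical, and the sketch assumes every file.csv sits exactly one folder below the root and has no header row, as in the question's code.

```python
import glob
import os


def find_labeled_csvs(root, name="file.csv"):
    """Return sorted (folder_name, path) pairs for files one level below root."""
    pattern = os.path.join(root, "*", name)
    pairs = []
    for path in sorted(glob.glob(pattern)):
        # The legend label is the name of the folder that holds the file.
        label = os.path.basename(os.path.dirname(path))
        pairs.append((label, path))
    return pairs


def plot_labeled_csvs(pairs):
    """Plot column 0 against column 1 of each CSV, one labeled line per folder."""
    # Imported here so the discovery helper works without the plotting stack.
    import matplotlib.pyplot as plt
    import pandas as pd

    for label, path in pairs:
        df = pd.read_csv(path, header=None)  # assumes no header row
        plt.plot(df[0], df[1], label=label)
    plt.legend()
    plt.show()
```

With the paths from the question, the call would look something like plot_labeled_csvs(find_labeled_csvs(r'C:\User\me\PathABC')). Axis limits and axis labels can be set before plt.show() exactly as in the original code.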

nigel222

Take a look at "How to list all files of a directory?" by @pycruft, edited by @Martin Thoma. I would use os.walk to collect the full path of every CSV file in the folders below a specific path, as follows:

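from os import walk
from os.path import join, splitext

specific_path = '.'  # placeholder: the top-level folder to search
f = []
for (dirpath, dirnames, filenames) in walk(specific_path):
    for filename in filenames:
        # Compare extensions case-insensitively so .csv and .CSV both match
        if splitext(filename)[1].upper() == '.CSV':
            f.append(join(dirpath, filename))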
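
Since the question only cares about files named file.csv and wants the folder as the legend label, the walk could be narrowed along these lines. This is a sketch rather than a definitive implementation: iter_named_csvs is a hypothetical name, the match on the file name is exact (case-sensitive), and the label is the folder's path relative to the root, which is one possible choice.

```python
import os


def iter_named_csvs(root, name="file.csv"):
    """Yield (label, path) for every file called `name` anywhere below root."""
    for dirpath, dirnames, filenames in os.walk(root):
        if name in filenames:
            # Label with the folder's path relative to root, so two folders
            # with the same name at different depths stay distinguishable.
            label = os.path.relpath(dirpath, root)
            yield label, os.path.join(dirpath, name)
```

Each yielded pair can then be read with pd.read_csv(path, header=None) and passed to plt.plot(..., label=label), as in the question's code. A file.csv sitting directly in the root gets the label '.', and os.path.basename(dirpath) could be used instead if only the innermost folder name is wanted.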
Pablo Reyes