
I'm looking to create a function that imports CSV files based on a list of file names built from user input. This is for some data analysis where I will then use pandas to resample the data and calculate the percentage of missing data. So far I have:

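import pandas as pd
from datetime import datetime

# pd.datetime is deprecated (removed in pandas 2.0), so use the standard library
parser = lambda x: datetime.strptime(x, '%d/%m/%Y %H:%M')
number_stations = input("Please tell me how many stations you want to analyse: ")
list_of_stations_name_number = []
# Ask for each station name in turn and store it with a .csv extension
for i in range(1, int(number_stations) + 1):
    name = input("Please enter the station name for station number {}: ".format(i))
    list_of_stations_name_number.append(name + '.csv')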

This works as intended: the user enters the names of the stations they want to analyse and ends up with a list stored in list_of_stations_name_number, such as:

list_of_stations_name_number = ['DM00115_D.csv', 'DM00117_D.csv', 'DM00118_D.csv', 'DM00121_D.csv', 'DM00129_D.csv']

Is there an easy way to then point to the directory (using os.chdir) and import the CSV files whose names match the list? I'm not sure how complicated or simple this would be, and I'm open to trying more efficient methods if applicable.

  • Are all your csv files in the same directory? If yes, you can read them all in by following https://stackoverflow.com/questions/20906474/import-multiple-csv-files-into-pandas-and-concatenate-into-one-dataframe – Mortz Apr 20 '21 at 10:51
  • Yes, they would be in the same location, but would this still work if you only wanted to check one or two of the files in the directory, rather than everything in the folder? – Adam Bermingham Apr 20 '21 at 10:52

1 Answer


To read all files, you can do something like -

list_of_dfs = [pd.read_csv(f) for f in list_of_stations_name_number]

list_of_dfs[0] will correspond to the CSV file list_of_stations_name_number[0], list_of_dfs[1] to list_of_stations_name_number[1], and so on.
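If positional indexes are awkward to keep track of, one option is to key each DataFrame by its file name instead. This is only a minimal sketch: read_station_files is a hypothetical helper name (not part of pandas), and it assumes every name in the list can be opened from the current working directory.

```python
import pandas as pd


def read_station_files(file_names):
    """Return a dict mapping each file name to its own DataFrame.

    Hypothetical helper: assumes every entry in file_names is a path
    that pd.read_csv can open as given.
    """
    return {fname: pd.read_csv(fname) for fname in file_names}


# Example use (assumes the files exist):
# dfs = read_station_files(list_of_stations_name_number)
# dfs['DM00115_D.csv'] is then the DataFrame for that station
```

Each station keeps its own DataFrame, so nothing is concatenated, and a frame can be looked up by the name the user typed.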

If your files are not in the current directory, you can prepend the directory path to the file names -

list_of_stations_name_number = [f'location/to/folder/{fname}' for fname in list_of_stations_name_number]
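As an alternative to os.chdir or building path strings by hand, the paths can be joined with pathlib, which also makes it easy to skip names that do not exist in the folder. The following is a sketch under assumptions: load_stations and data_dir are hypothetical names introduced here for illustration, and only the files named in the list are read, not everything in the directory.

```python
from pathlib import Path

import pandas as pd


def load_stations(file_names, data_dir):
    """Read only the requested CSV files from data_dir.

    Hypothetical helper. Returns (frames, missing): frames maps the file
    name without its extension to a DataFrame, and missing lists the
    requested names that were not found in data_dir.
    """
    data_dir = Path(data_dir)
    frames = {}
    missing = []
    for fname in file_names:
        path = data_dir / fname
        if path.is_file():
            # path.stem drops the extension, e.g. 'DM00115_D.csv' -> 'DM00115_D'
            frames[path.stem] = pd.read_csv(path)
        else:
            missing.append(fname)
    return frames, missing


# Example use (the folder path is a placeholder):
# frames, missing = load_stations(list_of_stations_name_number, 'location/to/folder')
```

Returning the missing names separately means a typo in a station name is reported rather than stopping the whole run with an exception.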
Mortz
  • This seems like it might work, but would this then just put all the csv files from different stations into the one file called df? Thanks for your help! – Adam Bermingham Apr 20 '21 at 11:03
  • Yes, that is correct- all files in the `list_of_stations_name_number` will be read into the DataFrame `df`. Is that not what you want? – Mortz Apr 20 '21 at 11:05
  • Not entirely, but it's definitely closer. The aim was more to get df to equal list element 1, df2 to equal list element 2, etc., so I could have 3 data frames for 3 csv files. Sorry, I probably wasn't clear. Hence the downvote. – Adam Bermingham Apr 20 '21 at 11:10
  • Then you don't need the concatenation - see edited answer above – Mortz Apr 20 '21 at 11:17