
I have a list of dataset names, and I want to loop over it so that the same operations are applied to every dataset. I can't figure out how to build the variable names inside the loop, though.

I'm trying to do something like this:

datasets = ['baseline', 'chan_0_light', 'chan_1_light', 'chan_2_light', 'chan_3_light', 
            'chan_0_pain', 'chan_1_pain', 'chan_2_pain', 'chan_3_pain', 'drought']

for i in datasets:
    df_ + datasets[i] = pd.read_csv('datasets\\' + str(datasets[i]) + '.csv')
    df_ + datasets[i] + ['Datetime'] = pd.to_datetime(df_ + datasets[i] + ['Datetime'], format='%Y-%m-%d %H:%M:%S.%f')
    df_ + datasets[i] = df_ + datasets[i] + .set_index("Datetime")

instead of writing it all out like this:

df_baseline = pd.read_csv('datasets\\baseline.csv')
df_baseline['Datetime'] = pd.to_datetime(df_baseline['Datetime'], format='%Y-%m-%d %H:%M:%S.%f')
df_baseline = df_baseline.set_index("Datetime")

df_chan_0_light = pd.read_csv('datasets\\chan_0_light.csv')
df_chan_0_light['Datetime'] = pd.to_datetime(df_chan_0_light['Datetime'], format='%Y-%m-%d %H:%M:%S.%f')
df_chan_0_light = df_chan_0_light.set_index("Datetime")

df_chan_1_light = pd.read_csv('datasets\\chan_1_light.csv')
df_chan_1_light['Datetime'] = pd.to_datetime(df_chan_1_light['Datetime'], format='%Y-%m-%d %H:%M:%S.%f')
df_chan_1_light = df_chan_1_light.set_index("Datetime")

# and so on

I finally found a way that works. For others' reference:

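import glob
import os

import pandas as pd

datasets = glob.glob("datasets/*.csv")

dfs_dict = {}

for filename in datasets:
    df = pd.read_csv(filename)
    df['Datetime'] = pd.to_datetime(df['Datetime'], format='%Y-%m-%d %H:%M:%S.%f')
    df = df.set_index("Datetime")
    # Key on the file name without folder or extension, e.g. "df_baseline"
    name = os.path.splitext(os.path.basename(filename))[0]
    dfs_dict["df_" + name] = df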
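Once the frames are in a dictionary, each one is looked up by its key, and the same operation can be run on all of them with a loop over `.items()`. Here is a minimal sketch of that usage. The two small frames and the `value` column are made up for illustration and stand in for whatever the loop above loads from the real files.

```python
import pandas as pd

# Two tiny stand-in frames; the real ones would come from the loading loop.
# The "value" column is a hypothetical example column.
index = pd.to_datetime(["2020-12-17 20:27:00.000", "2020-12-17 20:28:00.000"])
dfs_dict = {
    "df_baseline": pd.DataFrame({"value": [1.0, 2.0]}, index=index),
    "df_drought": pd.DataFrame({"value": [3.0, 5.0]}, index=index),
}

# Look up one dataset by name instead of using a separate df_baseline variable.
baseline = dfs_dict["df_baseline"]

# Apply the same operation to every dataset in one pass.
means = {name: df["value"].mean() for name, df in dfs_dict.items()}
print(means)
```

The string key plays the role that the variable name `df_baseline` would have played, so nothing has to be created dynamically.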
  • You don't want to do this. Use a dict instead. – MattDMo Dec 17 '20 at 20:27
  • Don't; make a dictionary where `datasets[i]` is a key rather than trying to dynamically construct a variable name. – chepner Dec 17 '20 at 20:27
  • Or rather, `i`. `i` is already an element of `datasets`, not an index. – chepner Dec 17 '20 at 20:28
  • `datasets = {'df_baseline': 'baseline', 'df_chan_0_light': 'chan_0_light'} for key, value in datasets.items(): key = pd.read_csv('datasets\\' + str(value) + '.csv') key['Datetime'] = pd.to_datetime(key['Datetime'], format='%Y-%m-%d %H:%M:%S.%f') key = key.set_index("Datetime")` didn't work but maybe closer to an answer? – coffeeandcigarettes Dec 18 '20 at 00:39
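The attempt in the last comment fails because `key = pd.read_csv(...)` only rebinds the loop variable `key`; it never creates a variable called `df_baseline`. Following the suggestions in the comments, a sketch that keeps the original list of names and stores each frame in a dict under its name could look like this. The helper name `load_datasets`, the `folder` parameter, and the throwaway demo files with a `value` column are assumptions for illustration, not part of the original code.

```python
import os
import tempfile

import pandas as pd


def load_datasets(names, folder):
    """Return {name: DataFrame} with 'Datetime' parsed and set as the index."""
    frames = {}
    for name in names:  # name is the list element itself, not an integer index
        df = pd.read_csv(os.path.join(folder, name + ".csv"))
        df["Datetime"] = pd.to_datetime(df["Datetime"], format="%Y-%m-%d %H:%M:%S.%f")
        frames[name] = df.set_index("Datetime")
    return frames


# Demo with throwaway files so the sketch runs without the real data.
with tempfile.TemporaryDirectory() as folder:
    for name in ["baseline", "drought"]:
        with open(os.path.join(folder, name + ".csv"), "w") as f:
            f.write("Datetime,value\n2020-12-17 20:27:00.000000,1.5\n")
    dfs = load_datasets(["baseline", "drought"], folder)

print(sorted(dfs))
```

With the real data, the call would presumably be `load_datasets(datasets, "datasets")`, after which `dfs["chan_0_light"]` replaces the `df_chan_0_light` variable.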
