I am trying to create multiple datasets using different oversampling methods from imblearn, name each one based on a list of names, and then retrieve them later by those names. When I try to name the outputs, I inadvertently put the whole dataset into the list in place of the name, rather than using the string from the list as the name of the dataset object.
I read through this post, but it isn't clear to me if / how it applies to my use case: How do I create variable variables?
I am using Python 3.7.5.
Where I think the issue is popping up:
os_X_train_list[i], os_y_train_list[i] = sampler().fit_resample(X_train, y_train)
I tried wrapping each "name" in a str() so the interpreter would recognize it as such rather than a location in the list, but then got a syntax error.
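To illustrate both symptoms in isolation, here is a minimal sketch. The names `names` and `resampled_data` are hypothetical placeholders, not part of my real code. As far as I can tell, assigning to `names[0]` replaces the string stored in that slot instead of naming anything, and a `str(...)` call cannot appear on the left side of `=` at all, which would explain the syntax error.

```python
# Hypothetical placeholders standing in for my name list and a resampled dataset.
names = ['RandomOverSampler_X_train_resample']
resampled_data = [[1, 2], [3, 4]]

# Assigning to a list index stores the object in that slot;
# the string that was there is simply replaced.
names[0] = resampled_data

# A function call is not a valid assignment target, so wrapping the
# name in str() on the left-hand side fails before the code even runs.
try:
    compile("str(names[0]) = [1, 2]", "<sketch>", "exec")
    str_target_is_syntax_error = False
except SyntaxError:
    str_target_is_syntax_error = True
```

After this runs, `names[0]` is the data itself and no string is left in the list.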
Full(er) code:
import imblearn.over_sampling as imbos

# X_train and y_train are assumed to be defined earlier (not shown)
oversampling_list = ['RandomOverSampler', 'SMOTE']

# Build the names I want to give the resampled datasets
os_X_train_list = []
os_y_train_list = []
for i in oversampling_list:
    os_X_train_list.append(i+'_X_train_resample')
    os_y_train_list.append(i+'_y_train_resample')

df_sample_list = []
for i in range(len(oversampling_list)):
    print(f'Oversampling using {oversampling_list[i]}')
    sampler = getattr(imbos, oversampling_list[i])
    # This is the line where the name strings get replaced by the datasets
    os_X_train_list[i], os_y_train_list[i] = sampler().fit_resample(X_train, y_train)
    df_sample_list.append(os_X_train_list[i])
    df_sample_list.append(os_y_train_list[i])
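For reference, this is roughly the behavior I am after, sketched with a dictionary keyed by the name strings, which is one common way to look objects up by name. Everything here other than `oversampling_list` and the `getattr` pattern is an assumption for illustration: `_FakeSampler` and the `SimpleNamespace` are hypothetical stand-ins for `imblearn.over_sampling` so the snippet runs on its own, and the toy `X_train` / `y_train` are made up.

```python
from types import SimpleNamespace

# Hypothetical stand-in for an imblearn sampler so the sketch runs without
# imblearn installed; it just returns copies of its inputs.
class _FakeSampler:
    def fit_resample(self, X, y):
        return list(X), list(y)

# Stand-in for `imblearn.over_sampling`; the real module would go here.
imbos = SimpleNamespace(RandomOverSampler=_FakeSampler, SMOTE=_FakeSampler)

# Toy data (assumption), in place of the real X_train / y_train.
X_train = [[0.1], [0.2], [0.3]]
y_train = [0, 0, 1]

oversampling_list = ['RandomOverSampler', 'SMOTE']

# Map each name string to its resampled data instead of overwriting the name.
resampled = {}
for name in oversampling_list:
    print(f'Oversampling using {name}')
    sampler = getattr(imbos, name)
    X_res, y_res = sampler().fit_resample(X_train, y_train)
    resampled[name + '_X_train_resample'] = X_res
    resampled[name + '_y_train_resample'] = y_res

# Later, retrieve a dataset by its name.
smote_X = resampled['SMOTE_X_train_resample']
```

With this layout the names stay as strings (the dictionary keys) and the datasets are the values, so `resampled['SMOTE_X_train_resample']` would return the resampled features.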
Thanks in advance for your help