I have separated my data into four DataFrames (SE_df, SO_df, etc., each matching a specific pattern in one of the columns of the original data). I want to run the same process on each of the four.
I tried something like this (note: the actual process is longer, but it starts with value_counts):
    lookup_table = {
        'SE': [SE_df, 'SE_df_counts'],
        'SO': [SO_df, 'SO_df_counts'],
        'OC': [OC_df, 'OC_df_counts'],
        'CW': [CW_df, 'CW_df_counts']
    }
    for site in ['SE', 'SO', 'OC', 'CW']:
        dframe_in = lookup_table[site][0]
        dframe_out = lookup_table[site][1]
        dframe_out = dframe_in.apply(pd.value_counts)
        # ...
When the loop finishes, I want four new DataFrames: SE_df_counts, SO_df_counts, etc.
Instead, I have one new DataFrame named dframe_out.
I initially tried to use
    lookup_table = {
        'SE': [SE_df, SE_df_counts],
        ...
    }
but Python complained that a DataFrame named SE_df_counts didn't exist yet.
I tried forcing it to exist (which made the code more brittle, but it was worth a shot):
    SE_df_counts = pd.DataFrame()
    lookup_table = {
        'SE': [df_SE_mic, SE_df_counts],
    }
    for site in ['SE']:
        dframe_in = lookup_table[site][0]
        dframe_out = lookup_table[site][1]
        dframe_out = dframe_in.apply(pd.value_counts)
and I still ended up with a DataFrame named dframe_out (which really confused me).
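Here is a minimal, self-contained reproduction of what I'm seeing. The toy data is made up for illustration, and I call col.value_counts() on each column instead of passing pd.value_counts (which I understand is deprecated in recent pandas versions) so the snippet runs on newer installs. As far as I can tell, the second assignment to dframe_out just points that name at a new object and leaves SE_df_counts untouched:

```python
import pandas as pd

# Toy data, made up purely for illustration
SE_df = pd.DataFrame({"a": [1, 1, 2], "b": [3, 3, 3]})
SE_df_counts = pd.DataFrame()

lookup_table = {"SE": [SE_df, SE_df_counts]}

for site in ["SE"]:
    dframe_in = lookup_table[site][0]
    dframe_out = lookup_table[site][1]
    # This assignment rebinds the name dframe_out to a new object;
    # it does not write into SE_df_counts or into lookup_table
    dframe_out = dframe_in.apply(lambda col: col.value_counts())

print(SE_df_counts.empty)           # the pre-created frame is still empty
print(lookup_table["SE"][1].empty)  # so is the dictionary entry
print(dframe_out.empty)             # only dframe_out holds the counts
```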
Is there a way to pass the desired name of a DataFrame as a variable (or a dictionary value)? I see many tempting recommendations here where people say "Use a dictionary", but the examples are always more complex than what I'm trying to do, and the ultimate answers often provide alternate ways around the question. (e.g. this question was very close, but the chosen answer wasn't relevant to my use case.)
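For reference, my best understanding of the "use a dictionary" suggestion is roughly the sketch below, where each result is stored under a site key rather than in its own named variable. The names frames and counts and the toy data are my own placeholders, and I'm not sure this is the pattern people actually mean:

```python
import pandas as pd

# Toy stand-ins for my four DataFrames (values are made up)
frames = {
    "SE": pd.DataFrame({"a": [1, 1, 2]}),
    "SO": pd.DataFrame({"a": [2, 2, 2]}),
    "OC": pd.DataFrame({"a": [1, 3, 3]}),
    "CW": pd.DataFrame({"a": [4, 4, 5]}),
}

# One result per site, looked up by key instead of by variable name
counts = {}
for site, dframe_in in frames.items():
    counts[site] = dframe_in.apply(lambda col: col.value_counts())

print(sorted(counts))
print(counts["SE"])
```

With this, counts["SE"] would take the place of SE_df_counts, but I would still like to know whether separately named variables are possible.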