Dataframe is not defined when trying to concatenate in loop (Python - Pandas)

Question

Consider the following list (named columns_list):

['total_cases',
 'new_cases',
 'total_deaths',
 'new_deaths',
 'total_cases_per_million',
 'new_cases_per_million',
 'total_deaths_per_million',
 'new_deaths_per_million',
 'total_tests',
 'new_tests',
 'total_tests_per_thousand',
 'new_tests_per_thousand',
 'new_tests_smoothed',
 'new_tests_smoothed_per_thousand',
 'tests_units',
 'stringency_index',
 'population',
 'population_density',
 'median_age',
 'aged_65_older',
 'aged_70_older',
 'gdp_per_capita',
 'extreme_poverty',
 'cvd_death_rate',
 'diabetes_prevalence',
 'female_smokers',
 'male_smokers',
 'handwashing_facilities',
 'hospital_beds_per_thousand',
 'life_expectancy']

Those are columns in two dataframes: US (df_us) and Canada (df_canada). I would like to create one dataframe for each item in the list, by concatenating its corresponding column from both df_us and df_canada.

for i in columns_list:
    
    df_i = pd.concat([df_canada[i],df_us[i]],axis=1)

Yet, when I type

df_new_deaths

I get the following output: name 'df_new_deaths' is not defined

Why?

https://stackoverflow.com/questions/40973687/create-new-dataframe-in-pandas-with-dynamic-names-also-add-new-column — David Erickson, Jul 14 '20 at 22:48
use the list entries as an id column and store one big df, or create a dictionary of dataframes — Derek Eden, Jul 14 '20 at 22:49
`df_i = ...` will not create `df_new_deaths` but only `df_i` - you should use dictionary `dfs = dict()` and then `dfs[i] = ...` and you will have `dfs["new_deaths"]` — furas, Jul 15 '20 at 01:41

Trenton McKinney · Accepted Answer · 2020-07-14T22:57:20.530

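The loop never saves a separate dataframe per column: df_i is a literal variable name, so i is not substituted into it, and each iteration simply overwrites df_i.
That is why df_new_deaths is never defined.
Add the dataframe for each column to a list and access it by index.
Note that pd.concat with axis=1 already returns a DataFrame, even when each input is a single column (a Series), so the pd.DataFrame wrapper in the code here is harmless but not required.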

df_list = list()
for i in columns_list:
    
    df_list.append(pd.DataFrame(pd.concat([df_canada[i],df_us[i]],axis=1)))
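
To look up a frame in the list by column name rather than by a hard-coded position, one option is to reuse the order of columns_list. This is a minimal sketch; the two small frames and their values are invented stand-ins for the real df_canada and df_us.

```python
import pandas as pd

# Hypothetical stand-ins for the real df_canada and df_us (values are made up)
df_canada = pd.DataFrame({"new_cases": [1, 2], "new_deaths": [0, 1]})
df_us = pd.DataFrame({"new_cases": [10, 20], "new_deaths": [3, 4]})
columns_list = ["new_cases", "new_deaths"]

# Build one two-column frame per entry, in the same order as columns_list
df_list = []
for col in columns_list:
    df_list.append(pd.concat([df_canada[col], df_us[col]], axis=1))

# The list keeps the order of columns_list, so find the position by name
new_deaths = df_list[columns_list.index("new_deaths")]
print(new_deaths)
```

Looking up the index on every access gets clumsy quickly, which is the main reason to prefer a dict keyed by column name.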

Alternatively, add the dataframes to a dict, where the column name is also the key:

df_dict = dict()
for i in columns_list:
    
    df_dict[i] = pd.DataFrame(pd.concat([df_canada[i],df_us[i]],axis=1))
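
With the dict, what the question wanted as df_new_deaths becomes df_dict["new_deaths"]. The sketch below uses made-up data in place of the real frames. It also passes keys= to pd.concat so the two columns get distinct labels; the labels "canada" and "us" are chosen here for illustration and do not come from the original post.

```python
import pandas as pd

# Hypothetical stand-ins for the real df_canada and df_us (values are made up)
df_canada = pd.DataFrame({"new_cases": [1, 2], "new_deaths": [0, 1]})
df_us = pd.DataFrame({"new_cases": [10, 20], "new_deaths": [3, 4]})
columns_list = ["new_cases", "new_deaths"]

# One frame per column name; keys= labels the resulting columns so the
# Canada and US values do not both end up under the same column name
df_dict = {
    col: pd.concat([df_canada[col], df_us[col]], axis=1, keys=["canada", "us"])
    for col in columns_list
}

# Access by column name instead of a dynamically named variable
new_deaths = df_dict["new_deaths"]
print(new_deaths)
```

Without keys=, both columns in each frame would carry the same name (for example new_deaths twice), which makes selecting one of them by label ambiguous.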
