Looping through a list and applying the same manipulation to them

Question

I have a dataframe which is essentially daily imports for each country in a given area. I need to make a new dataframe for each country.

Example;

Spain = df[['Date', 'Spain']]
Spain['Date']= pd.to_datetime(Spain['Date']) 
Spain = Spain.groupby([Spain.Date.dt.year,Spain.Date.dt.month]).mean()
Spain = Spain.unstack()
Spain = Spain/1745
Spain = Spain.round()

I am not very experienced with loops but is there any way I could create a list of all the countries and make a loop that would loop through that list to save me from writing the above code for each of the 100+ countries?

Any help much appreciated.

Is possible add some data sample 3 rows, 3 columns ? finally need one big DataFrame? — jezrael, Mar 23 '21 at 09:43

jezrael · Accepted Answer · 2021-03-23T10:01:11.503

1

First rule in pandas is try dont loop.

I think you need DataFrame.melt for unpivot first:

df1 = df.melt('Date', var_name='country', value_name='val')
df1['Date']= pd.to_datetime(df1['Date']) 
df2 = (df1.groupby(['country', df1.Date.dt.year,df1.Date.dt.month])['val']
           .mean()
           .unstack()
           .div(1745)
           .round())

edited Mar 23 '21 at 10:01

answered Mar 23 '21 at 09:42

jezrael

822,522
95
1,334
1,252

Thanks for answer, is this using Python3? I'm on Python2 and getting the following error; "AttributeError: 'DataFrame' object has no attribute 'melt'" – spcol Mar 24 '21 at 15:37
@spcol Use `df1 = pd.melt(df, 'Date', var_name='country', value_name='val')` – jezrael Mar 24 '21 at 16:51

score 1 · Answer 2 · answered Mar 23 '21 at 10:00

I have never used pandas so I will try an alternative answer:

You can make code that will write all of the code for you with the countries specified in a list. For example:

country = ['Spain', 'Italy']
code = ''

for item in country:
    code += f"{item} = df[['Date', '{item}']]" + '\n'
    code += f"{item}['Date'] = pd.to_datetime({item}['Date'])" + '\n'
    code += f"{item} = {item}.groupby([{item}.Date.dt.year, {item}.Date.dt.month]).mean()" + '\n'
    code += f"{item} = {item}.unstack()" + '\n'
    code += f"{item} = {item}/1745" + '\n'
    code += f"{item} = {item}.round()" + '\n\n'

When you go to print(code) you will get the following:

Spain = df[['Date', 'Spain']]
Spain['Date'] = pd.to_datetime(Spain['Date'])
Spain = Spain.groupby([Spain.Date.dt.year, Spain.Date.dt.month]).mean()
Spain = Spain.unstack()
Spain = Spain/1745
Spain = Spain.round()

Italy = df[['Date', 'Italy']]
Italy['Date'] = pd.to_datetime(Italy['Date'])
Italy = Italy.groupby([Italy.Date.dt.year, Italy.Date.dt.month]).mean()
Italy = Italy.unstack()
Italy = Italy/1745
Italy = Italy.round()

Which you can then copy/paste to your actual program.

Looping through a list and applying the same manipulation to them

2 Answers2