In the Python Data Science Handbook, the following example is given (the part I don't understand is the statement near the end, marked with a comment):
import pandas as pd
import numpy as np
import seaborn as sns
sns.set()
#Downloaded from: https://raw.githubusercontent.com/jakevdp/data-CDCbirths/master/births.csv
births = pd.read_csv('births.csv')
births['decades'] = (births['year'] // 10) * 10
# Robust sigma-clipping operation - ignore this
quartiles = np.percentile(births['births'], [25, 50, 75])
mu = quartiles[1]
sig = 0.74 * (quartiles[2] - quartiles[0])
births = births.query('(births > @mu - 5 * @sig) & (births < @mu + 5 * @sig)')
births['day'] = births['day'].astype(int)
births.index = pd.to_datetime(10000 * births.year +
                              100 * births.month +
                              births.day, format='%Y%m%d')
births_by_date = births.pivot_table('births', [births.index.month, births.index.day])
#Help on the loop below
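# Note: the book uses pd.datetime here, which was removed in pandas 2.0;
# pd.Timestamp(year, month, day) is used instead so the example still runs
births_by_date.index = [pd.Timestamp(2012, month, day)
                        for (month, day) in births_by_date.index]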
print(births_by_date.index)
I don't understand the construction of the births_by_date.index in the for loop. I understand that the loop is getting applied to the pivot table, but I've never seen what looks like the output array put before the loop.
Can someone explain how this works, or direct me to an appropriate explanation please?
I have tried: How do I save results of a "for" loop into a single variable?
numerous tutorials such as this one: https://www.learnpython.org/en/Loops
various other questions, but I can't find anything similar.
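To make the question concrete, here is a minimal sketch of how I read the construction, next to the explicit loop I would have expected. As far as I can tell this is what Python calls a list comprehension, but I may be wrong about that. The list month_day_pairs is a hypothetical stand-in for births_by_date.index, and datetime from the standard library stands in for the pandas call in the example:

```python
from datetime import datetime

# Hypothetical stand-in for births_by_date.index: a few (month, day) pairs
month_day_pairs = [(1, 1), (2, 29), (12, 31)]

# Form used in the book: the expression comes first, the loop after it
dates_comprehension = [datetime(2012, month, day)
                       for (month, day) in month_day_pairs]

# The explicit loop I would have written instead
dates_loop = []
for (month, day) in month_day_pairs:
    dates_loop.append(datetime(2012, month, day))

# Compare the two results
print(dates_comprehension == dates_loop)
```

If the two forms really are equivalent, then the bracketed expression builds the whole list first, and that list is what gets assigned to births_by_date.index.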