I have the following reproducible code, in which I build a DataFrame from a dictionary, group it by the Metropolitan area column, and use the agg()
function to compute the mean for each group:
import numpy as np
import pandas as pd

dictionaryMLB = {'Metropolitan area': ['New York City', 'New York City', 'Los Angeles', 'Los Angeles', 'San Francisco Bay Area', 'San Francisco Bay Area', 'Chicago', 'Chicago'],
                 'Population (2016 est.)[8]': [20153634, 20153634, 13310447, 13310447, 6657982, 6657982, 9512999, 9512999],
                 'MLB': ['Yankees', 'Mets', 'Dodgers', 'Angels', 'Giants', 'Athletics', 'Cubs', 'White Sox']}
df = pd.DataFrame(dictionaryMLB)
df.groupby('Metropolitan area').agg([np.mean])
My output is the following:
                       Population (2016 est.)[8]
                                            mean
Metropolitan area
Chicago                                  9512999
Los Angeles                             13310447
New York City                           20153634
San Francisco Bay Area                   6657982
I would like to avoid the two-level column name and keep only one of the two labels, either Population (2016 est.)[8] or mean,
to obtain, for example, the following:
                           mean
Metropolitan area
Chicago                 9512999
Los Angeles            13310447
New York City          20153634
San Francisco Bay Area  6657982
How should I proceed?
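For reference, the two-level header seems to come from passing a list of functions to agg(), which produces one column level for the original column name and another for the function name. A minimal sketch of one possible way to get a single-level column is below. It assumes only the population column needs to be aggregated, and the names result and alt are introduced here purely for illustration. Note that the means come back as floats rather than integers.

```python
import pandas as pd

dictionaryMLB = {
    'Metropolitan area': ['New York City', 'New York City', 'Los Angeles', 'Los Angeles',
                          'San Francisco Bay Area', 'San Francisco Bay Area', 'Chicago', 'Chicago'],
    'Population (2016 est.)[8]': [20153634, 20153634, 13310447, 13310447,
                                  6657982, 6657982, 9512999, 9512999],
    'MLB': ['Yankees', 'Mets', 'Dodgers', 'Angels', 'Giants', 'Athletics', 'Cubs', 'White Sox'],
}
df = pd.DataFrame(dictionaryMLB)

# Option 1 (illustrative): select the numeric column first, aggregate with a
# single function name (a string, not a list), then name the resulting column.
result = (
    df.groupby('Metropolitan area')['Population (2016 est.)[8]']
      .agg('mean')
      .to_frame('mean')
)

# Option 2 (illustrative): named aggregation, where the keyword becomes the
# output column name and the tuple is (source column, function).
alt = df.groupby('Metropolitan area').agg(
    mean=('Population (2016 est.)[8]', 'mean')
)

print(result)
```

Both variants leave Metropolitan area as the index and produce a single column called mean. Passing 'Population (2016 est.)[8]' to to_frame() instead would keep the original column label.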