
I have the following reproducible code, in which I build a DataFrame from a dictionary, group it by the Metropolitan area column, and use the agg() function to compute the mean for each group:

import numpy as np
import pandas as pd

dictionaryMLB = {'Metropolitan area': ['New York City', 'New York City', 'Los Angeles', 'Los Angeles', 'San Francisco Bay Area', 'San Francisco Bay Area', 'Chicago', 'Chicago'],
                 'Population (2016 est.)[8]': [20153634, 20153634, 13310447, 13310447, 6657982, 6657982, 9512999, 9512999],
                 'MLB': ['Yankees', 'Mets', 'Dodgers', 'Angels', 'Giants', 'Athletics', 'Cubs', 'White Sox']}

df = pd.DataFrame(dictionaryMLB)

df.groupby('Metropolitan area').agg([np.mean])

My output is the following:

                     Population (2016 est.)[8]
                           mean
Metropolitan area   
Chicago                   9512999
Los Angeles               13310447
New York City             20153634
San Francisco Bay Area    6657982

I would like to avoid the two-level column header and keep only one of the two names, either Population (2016 est.)[8] or mean, to obtain, for example, the following:

                            mean
Metropolitan area   
Chicago                   9512999
Los Angeles               13310447
New York City             20153634
San Francisco Bay Area    6657982

How should I proceed?

Mauro

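One possible way to get a single-level header, offered as a sketch and not a definitive answer: the two-level header appears because agg() is given a list of functions, so pandas stacks the function name under each column name. Selecting the population column before aggregating avoids this, and also keeps the non-numeric MLB column out of the mean. The variable names population, by_mean, by_named and by_original are illustrative and not part of the question; the second option assumes pandas 0.25 or later, where named aggregation is available.

```python
import pandas as pd

# Same data as in the question.
dictionaryMLB = {
    'Metropolitan area': ['New York City', 'New York City', 'Los Angeles', 'Los Angeles',
                          'San Francisco Bay Area', 'San Francisco Bay Area', 'Chicago', 'Chicago'],
    'Population (2016 est.)[8]': [20153634, 20153634, 13310447, 13310447,
                                  6657982, 6657982, 9512999, 9512999],
    'MLB': ['Yankees', 'Mets', 'Dodgers', 'Angels', 'Giants', 'Athletics', 'Cubs', 'White Sox'],
}
df = pd.DataFrame(dictionaryMLB)

# Illustrative helper name for the long column label.
population = 'Population (2016 est.)[8]'

# Option 1: select the column, take the mean (a Series), then convert it to a
# one-column DataFrame whose only header is 'mean'.
by_mean = df.groupby('Metropolitan area')[population].mean().to_frame('mean')

# Option 2: named aggregation (assumes pandas >= 0.25); the keyword is the
# output column name, the tuple is (input column, aggregation).
by_named = df.groupby('Metropolitan area').agg(mean=(population, 'mean'))

# Option 3: keep the original column name instead of 'mean' by selecting with
# a list, which returns a DataFrame with a single-level header.
by_original = df.groupby('Metropolitan area')[[population]].mean()

print(by_mean)
print(by_original)
```

If the two-level result from agg([np.mean]) has already been computed, calling droplevel(0, axis=1) on it should drop the upper level and leave only mean, though selecting the column up front is usually easier to read.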