One of the columns in the DataFrame is STNAME (state name). I want to create a pandas Series with index = STNAME and value = number of entries in the DataFrame for that state. A sample of the desired output is shown below:

STNAME
Michigan           83
Arizona            15
Wisconsin          72
Montana            56
North Carolina    100
Utah               29
New Jersey         21
Wyoming            23

My current solution is the following, but it seems a bit clumsy due to the need to pick an arbitrary column, rename that column, etc. I would like to know if there is a better way to do this.

grouped=df.groupby('STNAME')
# Note: County is an arbitrary column name I picked from the dataframe
grouped_df = grouped['COUNTY'].agg(np.size)
grouped_df.columns = ['Num Counties']

1 Answer

You can achieve this using value_counts(). This function is used to get a pd.Series containing counts of unique values:

freq = df['STNAME'].value_counts()

The index will hold the STNAME values, and each value will be its frequency. The result is sorted in descending order, so the first element is the most frequently occurring one.

Note that NA values are excluded by default; pass dropna=False to value_counts() if you want them counted as well.
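As a minimal sketch of the above, here is value_counts() next to groupby(...).size(), which also avoids picking an arbitrary column. The sample rows are made up for illustration; only the STNAME and COUNTY column names come from the question.

```python
import pandas as pd

# Hypothetical sample data (not from the question's real DataFrame).
df = pd.DataFrame({
    "STNAME": ["Utah", "Arizona", "Utah", None, "Arizona", "Utah"],
    "COUNTY": ["a", "b", "c", "d", "e", "f"],
})

# Counts per state, sorted by frequency (descending); NA is dropped.
counts = df["STNAME"].value_counts()

# Same, but the missing state name is counted too.
counts_with_na = df["STNAME"].value_counts(dropna=False)

# Alternative: group sizes, sorted by state name; the index is named STNAME.
by_group = df.groupby("STNAME").size()

print(counts)
print(by_group)
```

The two approaches give the same counts. They differ mainly in ordering: value_counts() sorts by frequency, while groupby(...).size() sorts by the group label.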

sophocles