Pandas: how to count the frequency of words in a column based on another column

Question

I have a csv file containing 4 columns, where column1 is car_id, column2 is manufacturer, column3 is car_year and column4 is the car model as below. I would like to use pandas dataframe to only count the model in coumn4 if the make in column2 is 'Ford'.

column 1    column2   column3   column4
 005         Ford      2012      Mustang
 125         Ford      2020      Focus
 223         BMW       2017      X5
 115         Ford      2015      Focus
 566         Kia       2011      ceed

So the output be something like

model    counts
Mustang  1
Focus    2

Any help. Thank you

I'mahdi · Accepted Answer · 2021-11-14T14:37:38.060

1

You can first select base on column2 then use .values_counts(). try this:

>>> df[df['column2'] == 'Ford']['column4'].value_counts()

As DataFrame:

>>> pd.DataFrame(df[df['column2'] == 'Ford']['column4'].value_counts()
                ).reset_index()\
.rename(columns={'index':'model', 'column4':'counts'}
)
    model       counts
0   Ford        2
1   Mustang     1

edited Nov 14 '21 at 14:37

answered Nov 14 '21 at 14:31

I'mahdi

23,382
5
22
30

Pandas: how to count the frequency of words in a column based on another column

1 Answers1