How to count all the cells under a single category with Pandas Python

Question

How do I count all the cells under 'Category' with pandas in Python? I tried df['Category'].value_counts(), but that gives me this output:

Engineering & Information Technology        1159
Manufacturing                               1044
Vehicle Service                              915
Supply Chain                                 378
Energy - Solar & Storage                     374
Construction & Facilities                    296
Sales & Customer Support                     269
Finance                                      119
Charging                                     115
Environmental, Health & Safety                93
Autopilot & Robotics                          78
Operations & Business Support                 75
HR                                            64
Design                                        59
Vehicle Software                              40
Legal & Government Affairs                    18
External Relations & Employee Experience       2
Name: Category, dtype: int64

This output just gives me a breakdown of the number of occurrences of each label in 'Category'. What I want is simply the total number of occurrences under 'Category'. So essentially I want to add up all the numbers in the right column. How do I do that?

Here is what the original data looks like (all text):

   Title                                              Category                              Location
0  Technical Product Analyst                          Engineering & Information Technology  Draper, Utah
1  Software Engineer                                  Engineering & Information Technology  Austin, Texas
2  Software Development Engineer                      Engineering & Information Technology  Fremont, California
3  Global Supply Analyst                              Supply Chain                          Palo Alto, California
4  Software Support Engineer, Battery Automation ...  Engineering & Information Technology  Austin, Texas

Are you looking for something different than df['Category'].count() ? — Raid, Dec 25 '22 at 04:08
Doesn't `len(df)` do the trick ? Or am I missing something about your dataframe? — asimoneau, Dec 25 '22 at 04:36
Ding ding. That is what I was looking for - df['Category'].count() - didn't know I needed to put brackets. Thank you — Michelle, Dec 25 '22 at 04:53
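
For anyone comparing the two suggestions in the comments, here is a minimal sketch. It assumes a hypothetical three-row frame with one missing 'Category' value (made up for illustration, not the asker's data). `count()` skips missing cells, `len(df)` counts every row, and summing `value_counts()` agrees with `count()` because `value_counts()` drops missing values by default.

```python
import pandas as pd

# Hypothetical sample data, not the asker's real dataset
df = pd.DataFrame({
    "Title": ["Software Engineer", "Global Supply Analyst", "Technician"],
    "Category": ["Engineering & Information Technology", "Supply Chain", None],
})

non_null_total = df["Category"].count()               # non-missing cells in the column
row_total = len(df)                                   # every row, missing Category included
summed_counts = df["Category"].value_counts().sum()   # adds up the per-label counts

print(non_null_total, row_total, summed_counts)
```

If the column has no missing values, all three numbers should be the same.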

score 0 · Answer 1 · answered Dec 25 '22 at 03:55

df.Category.sum() should do the trick.

Igor Rivin


I thought that would do it. But it spits out: " 'Engineering & Information TechnologyEngineering & Information TechnologyEngineering & Information TechnologySupply ChainEngineering & Information TechnologyEngineering & Information TechnologyManufacturingManufacturingSupply ChainEnergy - Solar & StorageVehicle ServiceVehicle SoftwareEnergy - Solar & StorageManufacturingVehicle ServiceManufacturingVehicle ServiceFinanceVehicle ServiceEnvironmental, Health & SafetyEngineering & Information TechnologyEnvironmental, Health & SafetyManufacturingManufacturingSales & Customer SupportEnergy ..." – Michelle Dec 25 '22 at 03:57
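
The output in that comment is what `sum()` does on a column of text: it "adds" the strings with `+`, which concatenates them. A small sketch, assuming a hypothetical three-value Series stored with object dtype (the values are made up for illustration):

```python
import pandas as pd

# Hypothetical text column; dtype=object mirrors a typical column of strings
s = pd.Series(["Supply Chain", "Finance", "Finance"], name="Category", dtype=object)

concatenated = s.sum()   # strings are joined end to end, not counted
total = s.count()        # number of non-missing cells

print(repr(concatenated))
print(total)
```

So `sum()` is the right tool for numeric columns, while `count()` answers "how many cells are there".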

score 0 · Answer 2 · answered Dec 25 '22 at 04:12

Example

We need a minimal, reproducible example to answer this. Let's make one:

import pandas as pd

# Small sample frame: a value column and a category label per row
data = [['A', 'upper'], ['c', 'lower'], ['d', 'lower'], ['A', 'upper'],
        ['B', 'upper'], ['e', 'lower'], ['d', 'lower'], ['d', 'lower']]
df = pd.DataFrame(data, columns=['col1', 'category'])

df

    col1    category
0   A       upper
1   c       lower
2   d       lower
3   A       upper
4   B       upper
5   e       lower
6   d       lower
7   d       lower

Code

out = df.groupby('category')['col1'].nunique()

out

category
lower    3                 <-- c, d, e
upper    2                 <-- A, B
Name: col1, dtype: int64

Thank you for simplifying the example. Now looking for how to sum all that up. — Michelle, Dec 25 '22 at 04:21
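
One possible way to sum it up, sketched on the same sample frame: call `.sum()` on the grouped result. Note that summing the `nunique` output adds up distinct values per group, which is not the same as the number of rows. The `size()` line is an added assumption about what "total" might mean here, not part of the answer above.

```python
import pandas as pd

data = [['A', 'upper'], ['c', 'lower'], ['d', 'lower'], ['A', 'upper'],
        ['B', 'upper'], ['e', 'lower'], ['d', 'lower'], ['d', 'lower']]
df = pd.DataFrame(data, columns=['col1', 'category'])

unique_per_group = df.groupby('category')['col1'].nunique()  # distinct col1 values per category
rows_per_group = df.groupby('category').size()               # rows per category

unique_total = unique_per_group.sum()   # 3 + 2
row_total = rows_per_group.sum()        # 5 + 3, same as len(df)

print(unique_total, row_total)
```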
