Pandas groupby and save certain columns to CSV

Question

I am reading a CSV with the columns Employer, City, State, Zipcode and Jobtitle into pandas.

The requirement is to group by Employer + City, count the results and write four columns (Employer, City, Zipcode and Count) to CSV.

Here is what I have done so far:

import pandas as pd

data = pd.read_csv("jobs.csv")
data.groupby(["Employer", "City"]).count()

This gives me:

Employer    City       State    Zipcode   Jobtitle 
Emp1      Cincinnati     1        1          1   
Emp2      Delaware      14        0         14   
Emp3      Akron          1        0          1

What I want is:

Employer    City       Zipcode    Jobcount
Emp1      Cincinnati    12345         1  
Emp2      Delaware      22112        14  
Emp3      Akron         34567         1

Where Jobcount shows the number of jobs for the combination of Employer + City.

Looks like you need `data.groupby(['Employer', 'City', 'Zipcode'])['Jobcount'].count()` .. as described in the duplicate. — jpp, Apr 04 '18 at 17:05
This is not a duplicate question. data.groupby(['Employer', 'City', 'Zipcode'])['Jobcount'].count() gives error KeyError: 'Column not found: Jobcount' — kashaziz, Apr 04 '18 at 17:09

score 1 · Answer 1 · answered Apr 04 '18 at 17:07

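If you're expecting one zipcode per employer/city, you can do:

# size counts the rows in each group; reset_index turns the group keys back into columns
result = data.groupby(['Employer', 'City', 'Zipcode']).agg({'Jobtitle': 'size'}).reset_index()
result.columns = ['Employer', 'City', 'Zipcode', 'Jobcount']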
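
For completeness, here is a minimal end-to-end sketch that includes the CSV step from the question. The sample rows and the `job_counts` name are made up for illustration (they stand in for `jobs.csv`), and it assumes each Employer + City pair has a single, non-missing Zipcode:

```python
import io

import pandas as pd

# Hypothetical sample data standing in for jobs.csv
sample = io.StringIO(
    "Employer,City,State,Zipcode,Jobtitle\n"
    "Emp1,Cincinnati,OH,12345,Analyst\n"
    "Emp2,Delaware,OH,22112,Engineer\n"
    "Emp2,Delaware,OH,22112,Manager\n"
    "Emp3,Akron,OH,34567,Clerk\n"
)
data = pd.read_csv(sample)

# Count rows per Employer + City + Zipcode and name the count column
job_counts = (
    data.groupby(["Employer", "City", "Zipcode"])
    .size()
    .reset_index(name="Jobcount")
)

# With no path, to_csv returns the CSV text; index=False drops the row index
csv_text = job_counts.to_csv(index=False)
print(csv_text)
```

To write a file instead, pass a path, e.g. `job_counts.to_csv("job_counts.csv", index=False)` (the file name is only an example). Note that `groupby` drops rows whose key is missing by default, so rows with an empty Zipcode would not be counted.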

RCA

508
4
12

This worked for me with a slight modification. Thanks. – kashaziz Apr 07 '18 at 13:09
