I am reading a CSV with the columns Employer, City, State, Zipcode and Jobtitle into pandas.

The requirement is to group by Employer + City, count the results and write four columns (Employer, City, Zipcode and Count) to CSV.

Here is what I have done so far:

import pandas as pd

data = pd.read_csv("jobs.csv")
data.groupby(["Employer", "City"]).count()

This gives me:

Employer    City       State    Zipcode   Jobtitle 
Emp1      Cincinnati     1        1          1   
Emp2      Delaware      14        0         14   
Emp3      Akron          1        0          1 

What I want is:

Employer    City       Zipcode    Jobcount
Emp1      Cincinnati    12345         1  
Emp2      Delaware      22112        14  
Emp3      Akron         34567         1 

Where Jobcount shows the number of jobs for the combination of Employer + City.

kashaziz

1 Answer

If you're expecting one zipcode per employer/city pair, you can do:

# Named aggregation creates the Jobcount column; reset_index() turns the group keys back into columns
result = data.groupby(['Employer', 'City', 'Zipcode']).agg(Jobcount=('Jobtitle', 'size')).reset_index()
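One caveat with using Zipcode as a group key: by default, groupby drops rows whose key is missing, and the zero Zipcode counts in the question's output suggest some rows have no zipcode. As a rough sketch of an alternative, assuming each Employer + City pair has at most one distinct zipcode, you could group on Employer and City only and take the first non-missing Zipcode per group. The sample rows below are made up for illustration and stand in for jobs.csv.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for jobs.csv (not the asker's data)
csv_text = """Employer,City,State,Zipcode,Jobtitle
Emp1,Cincinnati,OH,12345,Analyst
Emp2,Delaware,OH,22112,Engineer
Emp2,Delaware,OH,,Manager
Emp3,Akron,OH,34567,Clerk
"""

# Read Zipcode as text so missing values do not turn the column into floats
data = pd.read_csv(io.StringIO(csv_text), dtype={"Zipcode": str})

result = (
    data.groupby(["Employer", "City"])
    .agg(
        Zipcode=("Zipcode", "first"),   # first non-missing zipcode in the group
        Jobcount=("Jobtitle", "size"),  # number of rows in the group
    )
    .reset_index()
)

# Passing a file path to to_csv would write a file instead of returning text
csv_out = result.to_csv(index=False)
print(csv_out)
```

Using 'size' counts every row in the group, including rows where Jobtitle is empty; use 'count' instead if those rows should be excluded.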
RCA