I'm working in Python and I have a Pandas DataFrame of Uber data from New York City. A part of the DataFrame looks like this:

    Year Week_Number    Total_Dispatched_Trips      
    2015    51          1,109
    2015    5           54,380
    2015    50          8,989
    2015    51          1,025
    2015    21          10,195
    2015    38          51,957
    2015    43          266,465
    2015    29          66,139
    2015    40          74,321
    2015    39          3
    2015    50          854

As it is right now, the same week appears multiple times for each year. I want to sum the values for "Total_Dispatched_Trips" for every week for each year. I want each week to appear only once per year. (So week 51 can't appear multiple times for year 2015 etc.). How do I do this? My dataset is over 3k rows, so I would prefer not to do this manually.

Thanks in advance.

P. Faske

1 Answer

Here it is, borrowing from the question "Convert number strings with commas in pandas DataFrame to float":

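    import locale
    from locale import atof

    import pandas as pd

    # Use the system's default locale; atof() parses "54,380" correctly
    # only if that locale uses "," as the thousands separator
    locale.setlocale(locale.LC_NUMERIC, '')

    # Convert the comma-formatted strings to numbers
    df['numeric_trip'] = pd.to_numeric(df.Total_Dispatched_Trips.apply(atof), errors='coerce')
    # Sum the trips for each (Year, Week_Number) pair
    df.groupby(['Year', 'Week_Number']).numeric_trip.sum()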
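If you would rather not depend on the machine's locale settings, a minimal sketch of the same idea is below. It assumes the column names are spelled exactly as in the question's table and that the trip counts are stored as strings with commas; the sample rows are copied from the question, and the name `weekly` is just illustrative.

```python
import pandas as pd

# Sample rows copied from the question; trip counts are strings with commas
df = pd.DataFrame({
    "Year": [2015, 2015, 2015, 2015, 2015],
    "Week_Number": [51, 5, 50, 51, 50],
    "Total_Dispatched_Trips": ["1,109", "54,380", "8,989", "1,025", "854"],
})

# Strip the thousands separators, then convert the strings to integers
df["numeric_trip"] = pd.to_numeric(
    df["Total_Dispatched_Trips"].str.replace(",", "", regex=False)
)

# One row per (Year, Week_Number) pair, holding the summed trips
weekly = df.groupby(["Year", "Week_Number"], as_index=False)["numeric_trip"].sum()
print(weekly)
```

Passing `as_index=False` keeps `Year` and `Week_Number` as ordinary columns, so each week shows up exactly once per year in a regular DataFrame.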
ℕʘʘḆḽḘ