Aggregate DataFrame over Index

Question

I have the following DataFrame:

                    (polygon object)     ASSAULT     BURGLARY   bank     cafe    crossing
INCIDENTDATE                                                                            
2009-01-01 02:00:00                A           1           0       0        1           0
2009-01-01 02:00:00                A           1           0       0        1           0
2009-01-01 02:00:00                A           1           0       1        0           0
2009-01-01 02:00:00                A           1           0       0        0           1
2009-01-01 02:00:00                A           1           0       0        1           0
2009-01-04 11:00:00                B           0           1       1        0           0
2009-01-04 11:00:00                B           0           1       1        0           0
2009-01-04 11:00:00                B           0           1       0        0           0
2009-01-04 11:00:00                B           0           1       1        0           0
2009-01-04 11:00:00                B           0           1       0        1           0

I want to aggregate that DataFrame so that each 'INCIDENTDATE' appears only once.

While doing this, I want the value of each column (except polygon) to be 1 if it was 1 in at least one of the rows sharing the same 'INCIDENTDATE'.

The final DataFrame should look like this:

                    (polygon object)    ASSAULT     BURGLARY    bank     cafe    crossing
INCIDENTDATE                                                                            
2009-01-01 02:00:00                A           1           0       1        1           1
2009-01-04 11:00:00                B           0           1       1        1           0

How would I achieve that in pandas? Googling my question pointed me to the groupby() function, but I really don't understand how I would use it here.

You can group the DataFrame by its index: df.groupby(df.index).max() — Vaishali, Dec 18 '18 at 17:52

score 3 · Answer 1 · answered Dec 18 '18 at 17:48

I think you can just reset the index, then group by the new INCIDENTDATE column and take the max of each group:

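df.reset_index(inplace=True)      # move the INCIDENTDATE index into a regular column
df.groupby('INCIDENTDATE').max()  # per date, a flag is 1 if any row in the group has a 1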
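
For anyone who wants to try this end to end, here is a minimal sketch. The sample data is made up to resemble the question, the column name "polygon" is an assumption, and the strings "A" and "B" only stand in for the real polygon objects. It groups on the index level directly, so reset_index() is not strictly needed:

```python
import pandas as pd

# Hypothetical sample data; "A" and "B" are placeholders for the polygon objects.
stamps = pd.to_datetime(["2009-01-01 02:00:00"] * 3 + ["2009-01-04 11:00:00"] * 2)
df = pd.DataFrame(
    {
        "polygon": ["A", "A", "A", "B", "B"],
        "ASSAULT": [1, 1, 1, 0, 0],
        "BURGLARY": [0, 0, 0, 1, 1],
        "bank": [0, 1, 0, 1, 0],
        "cafe": [1, 0, 0, 0, 1],
        "crossing": [0, 0, 1, 0, 0],
    },
    index=pd.DatetimeIndex(stamps, name="INCIDENTDATE"),
)

# Group rows that share the same index value and take the column-wise maximum.
# For 0/1 flags, the maximum is 1 exactly when at least one row has a 1.
result = df.groupby(level="INCIDENTDATE").max()
print(result)
```

Because the flags are only 0 or 1, max() acts as a logical "any" within each group.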

it's-yer-boy-chet

score 2 · Answer 2 · answered Dec 18 '18 at 17:47

The max function should do this:

df.groupby("INCIDENTDATE").agg("max")

Tim
