
I have a GroupBy object with row indexes that are integers.

light   worst_injury count
1       5            10217
2       5            4067
3       5            2142
4       5            1690
5       5            25848
6       5            734
9       5            18

I would like to re-name the rows (not the columns!) so that the 'light' column contains specific strings:

light    worst_injury    count
Day             5        10217
Dawn            5        4067
Dusk            5        2142
Dark- lit       5        1690
Dark- unlit     5        25848
Other           5        734
Unknown         5        18

I have a list of the strings that correspond to each number (['Day', 'Dawn', etc.]), but I don't know how to set them as the index before or during the GroupBy call. I've also tried making a pivot table, but that doesn't seem possible either, for the same reason.

I suppose I could write a script to change the original data to these strings, rather than the numbers. That seems like a less efficient way to do it, but I'm open to that option if there's no way to change the groupby object after or before the fact.

Here's the existing code; it groups the dataframe by light and each injury level, and then takes a count:

import pandas as pd

df = pd.read_csv(filename, sep='|',
                 usecols=['crash_deer_involv_assoc', 'worst_injury_in_accident', 'light', 'accident_month'])

for i in range(1, 6):
    # keep only the rows at this injury level
    inj = df[df['worst_injury_in_accident'] == i]
    # group by light condition (and injury level), then count the rows in each group
    grouped = inj.groupby(['light', 'worst_injury_in_accident'])
    print(grouped.agg('count'))
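
For what it's worth, the "change the original data to these strings" option mentioned above could be done right after read_csv - a rough sketch, where the light_labels name is made up and the values are taken from the desired output table:

# mapping assumed from the desired output above (light_labels is just an illustrative name)
light_labels = {1: 'Day', 2: 'Dawn', 3: 'Dusk', 4: 'Dark- lit',
                5: 'Dark- unlit', 6: 'Other', 9: 'Unknown'}
df['light'] = df['light'].map(light_labels)   # replace the numeric codes before grouping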

1 Answer


IIUC, you can use map with a dictionary d:

# maybe reset the index first, so the group keys become regular columns
df = df.reset_index()

print(df)
   light  worst_injury  count
0      1             5  10217
1      2             5   4067
2      3             5   2142
3      4             5   1690
4      5             5  25848
5      6             5    734
6      9             5     18

d = {1:'Day',2:'Dawn', 3:'Dusk',4:'Dark- lit',5:'Dark- unlit',6:'Other',9:'Unknown'}

df['light'] = df.light.map(d)
print(df)
         light  worst_injury  count
0          Day             5  10217
1         Dawn             5   4067
2         Dusk             5   2142
3    Dark- lit             5   1690
4  Dark- unlit             5  25848
5        Other             5    734
6      Unknown             5     18

If the column light is the index:

print(df)
       worst_injury  count
light                     
1                 5  10217
2                 5   4067
3                 5   2142
4                 5   1690
5                 5  25848
6                 5    734
9                 5     18

d = {1:'Day',2:'Dawn', 3:'Dusk',4:'Dark- lit',5:'Dark- unlit',6:'Other',9:'Unknown'}

df.index = df.index.to_series().map(d)
print(df)
             worst_injury  count
light                           
Day                     5  10217
Dawn                    5   4067
Dusk                    5   2142
Dark- lit               5   1690
Dark- unlit             5  25848
Other                   5    734
Unknown                 5     18
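
A minimal alternative sketch (same dict d as above): DataFrame.rename should accept a dict of index labels, which avoids rebuilding the index by hand.

df = df.rename(index=d)   # assumed alternative: maps index labels 1, 2, ... to 'Day', 'Dawn', ...
print(df)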

EDIT:

For the aggregation you can add the parameter as_index=False to groupby and call count - the output is a nice DataFrame and reset_index is not necessary:

print(df.groupby(['light','worst_injury_in_accident'], as_index=False).count())

Another option is to use size (for me as_index does not work there, so you need to call reset_index):

print(df.groupby(['light','worst_injury_in_accident']).size().reset_index(name='count'))

BTW, the difference: size includes NaN values, count does not.
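
To make that concrete, a tiny sketch on a made-up frame (the column names here are illustrative only):

import pandas as pd
import numpy as np

# toy frame with made-up names; one NaN on purpose
toy = pd.DataFrame({'light': [1, 1, 2],
                    'worst_injury': [5, np.nan, 5]})

print(toy.groupby('light').size())                    # 2 and 1 - the NaN row is still counted
print(toy.groupby('light')['worst_injury'].count())   # 1 and 1 - the NaN is skipped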

  • Your first solution worked perfectly! reset_index() seems very useful. Initially I got an Attribute Error "'DataFrame' object has no attribute 'light'", but when I changed `df.light.map` to `df['light'].map`, that fixed it. Thanks for your help! – ale19 May 06 '16 at 12:59
  • Thank you. I tried to add a better solution for the aggregation, see my edit. – jezrael May 06 '16 at 13:02
  • That's exactly what I was looking for-- some tiny little option or argument that I had missed in my search. Thanks for the detailed answer! I've accepted it. – ale19 May 06 '16 at 13:16