2

I'm trying to create an ascending list of integers by group in pandas:

Group    Date
  A      2000-01-01
  A      2000-01-12
  A      2000-01-15
  A      2000-10-01
  B      2005-02-05
  B      2006-04-10
  B      2010-08-20

Would become this:

Group    Date          obs_num
  A      2000-01-01      1
  A      2000-01-12      2
  A      2000-01-15      3
  A      2000-10-01      4
  B      2005-02-05      1
  B      2006-04-10      2
  B      2010-08-20      3
Luke
  • 6,699
  • 13
  • 50
  • 88

2 Answers2

5

You can also do something slick with cumsum such as:

df['obs_num'] = 1
df['obs_num'] = df.groupby('Group')['obs_num'].cumsum()

My small test had this version as 6ms vs 14.8ms for Tom's solution.

thetainted1
  • 451
  • 3
  • 4
2

I think this will work for you. I need to check a slight detail.

Start by resetting the index. It looks like when grouping by level (on the index) the order is not preserved.

In [31]: df
Out[31]: 
             Date
Group            
A      2000-01-01
A      2000-01-12
A      2000-01-15
A      2000-10-01
B      2005-02-05
B      2006-04-10
B      2010-08-20

[7 rows x 1 columns]

In [32]: df = df.reset_index()

Now it's just a one-liner:

In [42]: df['obs_num'] = df.groupby('Group').transform(lambda x: np.arange(1, len(x) + 1))

In [43]: df
Out[43]: 
  Group        Date  obs_num
0     A  2000-01-01        1
1     A  2000-01-12        2
2     A  2000-01-15        3
3     A  2000-10-01        4
4     B  2005-02-05        1
5     B  2006-04-10        2
6     B  2010-08-20        3

[7 rows x 3 columns]
TomAugspurger
  • 28,234
  • 8
  • 86
  • 69
  • 1
    Thanks. Getting a ValueError though: Wrong number of items passed 1, indices imply 22 – Luke Jan 08 '14 at 21:33