
df:

id1     id2     value1    value2
-----------------------------------
a       b       10        5
c       a       5         10
b       c       0         0
c       d       2         1
d       a       10        20
a       c       5         10

Get the rolling sum of the values associated with id 'a' across columns ['id1', 'id2']:

id1     id2     a.rolling(2).sum()
-----------------------------------
a       b       NaN
c       a       20
d       a       30
a       c       25

How would I get the rolling sum of the values for id 'a' from two different columns using df.groupby?

I tried `df.groupby(['id1','id2'])['value1','value2'].transform(lambda x: x.rolling(2).sum())`, but that didn't work.
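
For reproducibility, here is a minimal construction of the frame shown above (not part of the original post):

import pandas as pd

df = pd.DataFrame({
    'id1':    ['a', 'c', 'b', 'c', 'd', 'a'],
    'id2':    ['b', 'a', 'c', 'd', 'a', 'c'],
    'value1': [10, 5, 0, 2, 10, 5],
    'value2': [5, 10, 0, 1, 20, 10],
})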

Chipmunkafy
  • In your initial `df` all of the groups are single rows, it would be helpful to get your expected output. See [here](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) for how to format a good question. – Alex Oct 30 '18 at 21:29

2 Answers


Here's one way to do it

import numpy as np

# split the frame into the id columns and the value columns
i = df.filter(like='id')
v = df.filter(like='va')

# positions (row, column) where 'a' appears in either id column
x, y = np.where(i == 'a')

# keep those rows, pull the value paired with each 'a', then take a rolling sum
df.iloc[x].assign(A=v.values[x, y]).assign(Roll=lambda d: d.A.rolling(2).sum())

  id1 id2  value1  value2   A  Roll
0   a   b      10       5  10   NaN
1   c   a       5      10  10  20.0
4   d   a      10      20  20  30.0
5   a   c       5      10   5  25.0
piRSquared
  • This is what I needed, but I forgot to specify that I want to include all the other id values without having to assign them manually. For example, a column with the rolling sum for id 'a', another column for id 'b', and so on. How should I do that? – Chipmunkafy Oct 30 '18 at 21:51
  • The rows that `'b'` appears on might be different rows. I suggest building a good example to demonstrate what you are after and show what you'd like it to look like. And this should be in a new question. – piRSquared Oct 30 '18 at 21:54
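
Not part of either answer: a rough sketch of the per-id extension asked about in the comment above, reusing the same filter/np.where idea in a loop over all id values. The roll_* column names are made up for illustration, and it assumes an id appears at most once per row, as in the example frame.

import numpy as np
import pandas as pd

ids = df.filter(like='id')
vals = df.filter(like='value')

out = df.copy()
for name in pd.unique(ids.values.ravel()):
    # rows/columns where this id occurs in either id column
    rows, cols = np.where(ids.values == name)
    # the value paired with each occurrence, kept on the original row labels
    picked = pd.Series(vals.values[rows, cols], index=df.index[rows])
    # rolling sum per id; rows that don't contain this id stay NaN
    out['roll_' + name] = picked.rolling(2).sum()
out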

Using concat after filter

import pandas as pd

# pair (id1, value1) with (id2, value2) under shared column names, then stack them
df1 = df.filter(like='1')
df2 = df.filter(like='2')
df2.columns = df1.columns

# rolling sum within each id in the original row order, then keep id 'a'
s = pd.concat([df1, df2]).sort_index().groupby('id1').rolling(2).sum()
s = s.loc['a']
df.loc[s.index].assign(new=s)
Out[99]: 
  id1 id2  value1  value2   new
0   a   b      10       5   NaN
1   c   a       5      10  20.0
4   d   a      10      20  30.0
5   a   c       5      10  25.0
BENY