How to shift a column in pandas dataframe group and then join back to be a single dataframe?

Question

if I have a dataframe like this:

category name index 
A        A11   1
A        A12   1
A        A13   1
A        A21   2
A        A22   2
A        A23   2
...
B        B11   1
B        B21   2
...

I want to first group the dataframe by the category and index, and inside each group I want to shift the name column by the value of index (reversed shifting) - 1.

So the result would be like:

category name index 
A        A11   1    # Do not shift anything because index is 1
A        A12   1
A        A13   1
A        A22   2    # Shift -1 for anything in (A, 2) group because index here is 2
A        A23   2
A        A24   2
...
B        B11   1
B        NA    2  # at the end of each group it should be NA

I have tried to use .groupby() and then .apply() but it seems that there is no easy way to join the result back to the dataframe. It seems that the apply method can only return a series. How could I achieve this using the apply method?

Is this a roll, so that `A21` goes to the end of the group? The shift is by index-1? — ALollz, Feb 22 '19 at 14:41
I think its more like a groupby and substract index to the numerical part of name — yatu, Feb 22 '19 at 14:44
@ALollz updated the question. It is a shift by -1 for 2, shift -2 for 3 etc. — Bob Fang, Feb 22 '19 at 14:44
But what happens at the end of the groups? Should the values be wrapped around, like in a `roll` or are they filled with `NaN`? You haven't shown enough information to infer that. — ALollz, Feb 22 '19 at 14:48

score 1 · Accepted Answer · answered Feb 22 '19 at 14:48

1

If I understand your question

l=[y.assign(name=y.name.shift(-x[1]+1).values) for x , y in df.groupby(['category','index'])]

Newdf=pd.concat(l)
Newdf
Out[644]: 
  category name  index
0        A  A11      1
1        A  A12      1
2        A  A13      1
3        A  A23      2
4        A  A24      2
5        A  NaN      2

answered Feb 22 '19 at 14:48

BENY

317,841
20
164
234

I am not saying this is a bad answer at all but this is using a for loop basically. I have thought of doing so but was stuck in trying to come up with a group_by, apply magic – Bob Fang Feb 22 '19 at 14:56
@dorafmon apply always have bad performance , do not all ways think for loop is bad , check https://stackoverflow.com/questions/54432583/when-should-i-ever-want-to-use-pandas-apply-in-my-code – BENY Feb 22 '19 at 14:59

score 0 · Answer 2 · answered Feb 22 '19 at 15:11

One way I think you can do this, if I understand correctly:

Given df,

  category name  index
0        A  A11      1
1        A  A12      1
2        A  A13      1
3        A  A21      2
4        A  A22      2
5        A  A23      2
6        B  B11      1
7        B  B21      2

Use query to filter where index is not 1, then shift within groupby and concat part of filter dataframe where index equals 1 to shifted dataframe.

df_shifted = (df.query("index > 1").groupby(['category','index'])
                                   .apply(lambda x: x['name'].shift(-1))
                .reset_index().set_index('level_2').rename_axis(None))

df_out = pd.concat([df_shifted, df.query("index == 1")], sort=False).sort_index()
df_out

Output:

  category  index name
0        A      1  A11
1        A      1  A12
2        A      1  A13
3        A      2  A22
4        A      2  A23
5        A      2  NaN
6        B      1  B11
7        B      2  NaN

How to shift a column in pandas dataframe group and then join back to be a single dataframe?

2 Answers2