1

if I have a dataframe like this:

category name index 
A        A11   1
A        A12   1
A        A13   1
A        A21   2
A        A22   2
A        A23   2
...
B        B11   1
B        B21   2
...

I want to first group the dataframe by the category and index, and inside each group I want to shift the name column by the value of index (reversed shifting) - 1.

So the result would be like:

category name index 
A        A11   1    # Do not shift anything because index is 1
A        A12   1
A        A13   1
A        A22   2    # Shift -1 for anything in (A, 2) group because index here is 2
A        A23   2
A        A24   2
...
B        B11   1
B        NA    2  # at the end of each group it should be NA

I have tried to use .groupby() and then .apply() but it seems that there is no easy way to join the result back to the dataframe. It seems that the apply method can only return a series. How could I achieve this using the apply method?

Bob Fang
  • 6,963
  • 10
  • 39
  • 72

2 Answers2

1

If I understand your question

l=[y.assign(name=y.name.shift(-x[1]+1).values) for x , y in df.groupby(['category','index'])]

Newdf=pd.concat(l)
Newdf
Out[644]: 
  category name  index
0        A  A11      1
1        A  A12      1
2        A  A13      1
3        A  A23      2
4        A  A24      2
5        A  NaN      2
BENY
  • 317,841
  • 20
  • 164
  • 234
  • I am not saying this is a bad answer at all but this is using a for loop basically. I have thought of doing so but was stuck in trying to come up with a group_by, apply magic – Bob Fang Feb 22 '19 at 14:56
  • @dorafmon apply always have bad performance , do not all ways think for loop is bad , check https://stackoverflow.com/questions/54432583/when-should-i-ever-want-to-use-pandas-apply-in-my-code – BENY Feb 22 '19 at 14:59
0

One way I think you can do this, if I understand correctly:

Given df,

  category name  index
0        A  A11      1
1        A  A12      1
2        A  A13      1
3        A  A21      2
4        A  A22      2
5        A  A23      2
6        B  B11      1
7        B  B21      2

Use query to filter where index is not 1, then shift within groupby and concat part of filter dataframe where index equals 1 to shifted dataframe.

df_shifted = (df.query("index > 1").groupby(['category','index'])
                                   .apply(lambda x: x['name'].shift(-1))
                .reset_index().set_index('level_2').rename_axis(None))

df_out = pd.concat([df_shifted, df.query("index == 1")], sort=False).sort_index()
df_out

Output:

  category  index name
0        A      1  A11
1        A      1  A12
2        A      1  A13
3        A      2  A22
4        A      2  A23
5        A      2  NaN
6        B      1  B11
7        B      2  NaN
Scott Boston
  • 147,308
  • 15
  • 139
  • 187