How to shift a cell depending on a value in another column in python?

Question

I have a dataframe as follows:

id | time
1  | 10:21
1  | 10:22
1  | 10:23
2  | 10:40
2  | 10:45
2  | 10:50

I would like to add a new column as follows:

id | time  | new_time
1  | 10:21 | 10:22
1  | 10:22 | 10:23
1  | 10:23 | None
2  | 10:40 | 10:45
2  | 10:45 | 10:50
2  | 10:50 | None

In other words, I want to build the new column by comparing the id values of consecutive rows. If two consecutive rows have the same id, new_time in the first row should take the time value from the second row. If the ids differ, new_time should be None. How can I achieve this using Python or pandas?

Possible duplicate of [Adding new column to existing DataFrame in Python pandas](https://stackoverflow.com/questions/12555323/adding-new-column-to-existing-dataframe-in-python-pandas) — Rafael Barros, Feb 25 '19 at 15:43
I am sorry for not explaining it clearly. I would like to create the new column by matching the values in the `id` column. For example, if the `id` values for two consecutive rows are the same, then `new_time` in the first row should take the `time` value from the second row. If the `id` values are not the same, then `new_time` should be `None`. — Rafi, Feb 25 '19 at 15:48
@RafaelBarros: I am not sure if your provided link resolves my issue. I would like to use (by shifting and matching) the existing columns to calculate the new column. — Rafi, Feb 25 '19 at 15:54

score 6 · Accepted Answer · answered Feb 25 '19 at 16:07

Use .shift(-1) to get the next record:

df['new_time'] = df.shift(-1).time

Results:

    id  time    new_time
0   1   10:21   10:22
1   1   10:22   10:23
2   1   10:23   10:40
3   2   10:40   10:45
4   2   10:45   10:50
5   2   10:50   NaN

Then assign np.nan to the last row for each id:

import numpy as np

df.loc[df.groupby('id', as_index=False).nth([-1]).index, 'new_time'] = np.nan

Results:

    id  time    new_time
0   1   10:21   10:22
1   1   10:22   10:23
2   1   10:23   NaN
3   2   10:40   10:45
4   2   10:45   10:50
5   2   10:50   NaN
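
The two steps above can be put together as a self-contained sketch. It assumes the sample data from the question, with rows already ordered so that each id forms one contiguous block. Note that it swaps `nth([-1])` for `groupby(...).tail(1)` to find the last row of each id; that is a substitution for illustration, not the answer's exact call.

```python
import numpy as np
import pandas as pd

# Sample data from the question (assumed to be sorted by id).
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "time": ["10:21", "10:22", "10:23", "10:40", "10:45", "10:50"],
})

# Step 1: pull the next row's time up by one position, ignoring id.
df["new_time"] = df["time"].shift(-1)

# Step 2: the last row of each id received a value from the next group
# (or nothing at all), so overwrite it with NaN.
last_rows = df.groupby("id").tail(1).index
df.loc[last_rows, "new_time"] = np.nan

print(df)
```

If the ids are interleaved rather than contiguous, step 1 would still copy values across ids on the non-final rows, so this approach probably needs a sort first.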

Thank you for your comment. It was helpful. I solved my issue in the following way: `df['new_time'] = df.groupby(['id'])['time'].shift(-1)` — Rafi, Feb 25 '19 at 16:21
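
For reference, the one-liner from the comment above can be run end to end as follows. This is a minimal sketch using the sample data from the question; the shift happens inside each id group, so no clean-up step is needed afterwards.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "time": ["10:21", "10:22", "10:23", "10:40", "10:45", "10:50"],
})

# Shift within each id group: every row gets the next time of the same id,
# and the last row of each group is left as NaN.
df["new_time"] = df.groupby("id")["time"].shift(-1)

print(df)
```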

score 2 · Answer 2 · answered Feb 25 '19 at 16:10

Assuming ID is currently the index, I would try:

df = df.reset_index()
shifted = df.shift(-1)
df['new_time'] = shifted.time
df.loc[df.id != shifted.id, "new_time"] = None
df = df.set_index("id")

If it's not an index you can just skip the reset_index and set_index lines.

It basically shifts the entire dataframe, matches them up together and wherever IDs are no longer the same it sets those values to None.
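
A compact variant of the same idea, sketched under the assumption that id is a regular column and using the sample data from the question. It uses `Series.where` instead of assigning None through `.loc`; that is a substitution for illustration, and it leaves NaN rather than None in the unmatched rows.

```python
import pandas as pd

# Sample data from the question, with id as a regular column.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "time": ["10:21", "10:22", "10:23", "10:40", "10:45", "10:50"],
})

# Shift the whole frame up by one row so row i lines up with row i + 1.
shifted = df.shift(-1)

# True where the next row belongs to the same id; the final row compares
# against NaN and is therefore False.
same_id = df["id"] == shifted["id"]

# Keep the next row's time only where the ids match; NaN elsewhere.
df["new_time"] = shifted["time"].where(same_id)

print(df)
```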

Pluckerpluck

Thank you for your comment. It was helpful. I solved my issue in the following way: `df['new_time'] = df.groupby(['id'])['time'].shift(-1)` – Rafi Feb 25 '19 at 16:20
I did not know that pandas let you shift on a group... That is good to know – Pluckerpluck Feb 25 '19 at 16:46
