I want to create a new column in a pandas DataFrame based on values found in a previous row.
Specifically, I want to add a column holding the difference, in days, between the date on the current row and the date on the most recent previous row that has the same UserId and Amount > 0. If no such previous row exists, the value should be -1.
I have this:
+--------+------------+--------+
| UserId | Date       | Amount |
+--------+------------+--------+
| 1      | 2017-01-01 | 0      |
| 1      | 2017-01-03 | 10     |
| 2      | 2017-01-04 | 20     |
| 2      | 2017-01-07 | 15     |
| 1      | 2017-01-09 | 7      |
+--------+------------+--------+
And I want this:
+--------+------------+--------+------------+
| UserId | Date       | Amount | Difference |
+--------+------------+--------+------------+
| 1      | 2017-01-01 | 0      | -1         |
| 1      | 2017-01-03 | 10     | -1         |
| 2      | 2017-01-04 | 20     | -1         |
| 2      | 2017-01-07 | 15     | 3          |
| 1      | 2017-01-09 | 7      | 6          |
+--------+------------+--------+------------+
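
For reference, here is a minimal sketch of one way the desired column might be computed. It is only an illustration, not necessarily the best approach, and it assumes that Date is a datetime column and that the rows are already in chronological order within each UserId. The intermediate names paid_date and last_paid are made up for this example.

```python
import pandas as pd

# Rebuild the sample data from the first table.
df = pd.DataFrame({
    "UserId": [1, 1, 2, 2, 1],
    "Date": pd.to_datetime([
        "2017-01-01", "2017-01-03", "2017-01-04", "2017-01-07", "2017-01-09",
    ]),
    "Amount": [0, 10, 20, 15, 7],
})

# Keep the date only on rows with Amount > 0; other rows become NaT.
paid_date = df["Date"].where(df["Amount"] > 0)

# Within each user, shift by one row so the current row is excluded,
# then forward-fill to carry the last qualifying date down.
# Assumption: rows are sorted by Date within each UserId.
shifted = paid_date.groupby(df["UserId"]).shift()
last_paid = shifted.groupby(df["UserId"]).ffill()

# Difference in days; rows without a qualifying previous row get -1.
df["Difference"] = (df["Date"] - last_paid).dt.days.fillna(-1).astype(int)

print(df)
```

If the rows were not sorted by date within each user, sorting with df.sort_values(["UserId", "Date"]) before the groupby steps would probably be needed.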