I'm having a pandas issue.
I have a dataframe that looks like the following:
name date
Mark 2018-01-01
Anne 2018-01-01
Anne 2018-02-01
Anne 2018-04-01
Anne 2018-09-01
Anne 2019-01-01
John 2018-02-01
John 2018-06-01
John 2019-02-01
Ethan 2018-03-01
And I need to compute an additional column, call it months
, that contains the number of months between each consecutive row, for the same name. This is in order to calculate how many months have gone by between one appereance of a name and the next one in the DataFrame. The calculation must be done grouping by name, as I only want to know the number of months between consecutive appereances for one user, not for different names.
The expected output in this case would be:
name date months
Mark 2018-01-01 0
Anne 2018-01-01 0
Anne 2018-02-01 1
Anne 2018-04-01 2
Anne 2018-09-01 5
Anne 2019-01-01 4
John 2018-02-01 0
John 2018-06-01 4
John 2019-02-01 8
Ethan 2018-03-01 0
Any tips on what is the most efficient way of computing the months
column will be highly appreciated.
Note that The dates always have the first day of the month, which makes the calculation much easier.