I have a DataFrame with 200,000 rows that looks like this:
import pandas as pd
inp = [{'guess':173, 'start_date':'2004-05-06', 'end_date':'2004-05-06'}, {'guess':173, 'start_date':'2018-07-06', 'end_date':'2018-07-05'},
{'guess':347, 'start_date':'2011-05-30', 'end_date':'2018-10-09'}, {'guess':347, 'start_date':'2011-10-27', 'end_date':'2099-01-01'},
{'guess':347, 'start_date':'2015-12-29', 'end_date':'2099-01-01'},{'guess':347, 'start_date':'2016-01-05', 'end_date':'2099-01-01'},
{'guess':347, 'start_date':'2018-11-02', 'end_date':'2099-01-01'}]
df = pd.DataFrame(inp)
df.head()
Now I want to iterate over the rows of this frame. First of all, I want to check whether there are other rows with the same guess ID. If there are, I want to count how many products that guess already has active at the time it buys the new product.
The output I am looking for is:
Output:
   Guess  start_date    end_date  Counter
0   1734  2004-05-06  2018-05-05        0
1   1734  2018-07-06  2099-01-01        0   (0 because when he buys the second item, the first one has already been deleted)
2   3470  2011-05-30  2018-10-09        0
3   3470  2011-10-27  2099-01-01        1
4   3470  2015-12-29  2099-01-01        2
5   3470  2016-01-05  2099-01-01        3
6   3470  2018-11-02  2099-01-01        3   (the same happens here as in row 1)
I have been trying with "iterrows()", but the DataFrame is too big for it.
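To make the rule concrete, here is a minimal sketch of the count I am after, written per group instead of per row. It assumes that "active" means an earlier row of the same guess (strictly earlier start_date) whose end_date is strictly later than the current row's start_date. The function name count_active is only illustrative, and the pairwise comparison uses memory quadratic in the size of each group, so it may not suit very large groups.

```python
import pandas as pd

# Sample rows copied from the question (trailing space in one date removed).
df = pd.DataFrame([
    {"guess": 173, "start_date": "2004-05-06", "end_date": "2004-05-06"},
    {"guess": 173, "start_date": "2018-07-06", "end_date": "2018-07-05"},
    {"guess": 347, "start_date": "2011-05-30", "end_date": "2018-10-09"},
    {"guess": 347, "start_date": "2011-10-27", "end_date": "2099-01-01"},
    {"guess": 347, "start_date": "2015-12-29", "end_date": "2099-01-01"},
    {"guess": 347, "start_date": "2016-01-05", "end_date": "2099-01-01"},
    {"guess": 347, "start_date": "2018-11-02", "end_date": "2099-01-01"},
])
df["start_date"] = pd.to_datetime(df["start_date"])
df["end_date"] = pd.to_datetime(df["end_date"])


def count_active(frame):
    """Hypothetical helper: for each row, count earlier rows of the same
    guess that are still active (assumed: end_date > this row's start_date)."""
    counter = pd.Series(0, index=frame.index, dtype="int64")
    for _, group in frame.groupby("guess"):
        start = group["start_date"].to_numpy()
        end = group["end_date"].to_numpy()
        # earlier[i, j]: row j started strictly before row i
        earlier = start[None, :] < start[:, None]
        # still_active[i, j]: row j ends strictly after row i starts
        still_active = end[None, :] > start[:, None]
        counter.loc[group.index] = (earlier & still_active).sum(axis=1)
    return counter


df["Counter"] = count_active(df)
print(df)
```

This loops over groups rather than rows, so the Python-level work scales with the number of distinct guess values instead of the number of rows.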