I have a pandas data frame with 4 columns, A, B, C, D:
A B C D
1 0 2 ["apple", "pear", "peach"]
2 2 3 ["cherry"]
3 3 3 ["banana", "cherry"]
4 4 7 []
5 1 3 ["apple", "grapes"]
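For reproducibility, here is a minimal sketch that builds the frame above. It assumes column D holds Python lists of strings and that A, B and C are plain integers; the variable name `df` is only for illustration.

```python
import pandas as pd

# Sample data copied from the table above.
# Assumption: column D holds Python lists of strings.
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": [0, 2, 3, 4, 1],
    "C": [2, 3, 3, 7, 3],
    "D": [
        ["apple", "pear", "peach"],
        ["cherry"],
        ["banana", "cherry"],
        [],
        ["apple", "grapes"],
    ],
})
print(df)
```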
I want to get the following result by repeating each row so that column B counts up from its original value to the value in column C, inclusive. That is, each row appears (C - B + 1) times, with B incremented by 1 on each repeat.
A B C D
1 0 2 ["apple", "pear", "peach"]
1 1 2 ["apple", "pear", "peach"]
1 2 2 ["apple", "pear", "peach"]
2 2 3 ["cherry"]
2 3 3 ["cherry"]
3 3 3 ["banana", "cherry"]
4 4 7 []
4 5 7 []
4 6 7 []
4 7 7 []
5 1 3 ["apple", "grapes"]
5 2 3 ["apple", "grapes"]
5 3 3 ["apple", "grapes"]
I'm not sure how to achieve this in pandas. Thanks.
Edit: Column D may not be unique to each B value, so a group-by may not work here.
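For what it's worth, one direction that might work without any group-by is to turn each B into the list of values from B to C and then call `DataFrame.explode`. This is only a sketch under some assumptions: B <= C in every row, D holds Python lists, and the pandas version is 1.1 or later (for `ignore_index`). The names `expanded` and `result` are made up for illustration.

```python
import pandas as pd

# Same sample data as in the question (assumption: D holds Python lists).
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": [0, 2, 3, 4, 1],
    "C": [2, 3, 3, 7, 3],
    "D": [
        ["apple", "pear", "peach"],
        ["cherry"],
        ["banana", "cherry"],
        [],
        ["apple", "grapes"],
    ],
})

# Replace each B with the list B, B+1, ..., C (inclusive).
# Assumption: B <= C in every row; an empty range would explode to NaN.
b_ranges = pd.Series(
    [list(range(b, c + 1)) for b, c in zip(df["B"], df["C"])],
    index=df.index,
)
expanded = df.assign(B=b_ranges)

# explode yields one row per element of B and repeats the other columns,
# so D is carried along untouched and never used as a key.
result = expanded.explode("B", ignore_index=True)

# explode leaves B with object dtype, so cast it back to integers.
result["B"] = result["B"].astype(int)
print(result)
```

Since only column B is exploded, duplicate or empty lists in D should not matter here.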