0

I need to make groupby depending on a condition which includes two columns, so for example, I have the following dataframe:

item   start_date    end_date
A      10/03/2019    31/03/2019
B      10/03/2019    31/03/2019
A      31/03/2019    10/04/2019
B      31/03/2019    10/04/2019

and I need to groupbyon items if the end_date equals to the start_date and resulting row should have the start_date of the first row and the end_date of the second row

item   start_date    end_date
A      10/03/2019    10/04/2019
B      10/03/2019    10/04/2019

An easier example:

item   start_date    end_date
A          a             b
A          b             c
A          d             e
A          e             f

The wanted result:

   item   start_date    end_date
    A          a             c
    A          d             f
Rodwan Bakkar
  • 474
  • 3
  • 17
  • How is that a groupby? Sounds more like a filter. Also in your example data not a single row fulfills that condition, so how do you arrive at the expected result? Or did you mean "the same day of the month"? – a_guest Mar 24 '20 at 15:38
  • For first item `A`, the `end_date` is equal to the `start_date` of the second item `A` so we group by and give the resulting row the `start_date` of the first `A` and the `end_date` of the second `A`. – Rodwan Bakkar Mar 24 '20 at 15:41
  • I am investigating if this is an answer I need – Rodwan Bakkar Mar 24 '20 at 15:56
  • @a_guest this is almost a solution, but it does not work in my case because the item could be repeated for various dates intervals, like I updated in the question, and this solution will groupby all of them. – Rodwan Bakkar Mar 24 '20 at 17:30
  • @a_guest doing more tests, I made sure it works in all cases, this is the solution I needed, thank you! – Rodwan Bakkar Mar 24 '20 at 17:40

1 Answers1

0

I think sorting the DataFrame by values would do. It depends on your aim, too.

df = df.sort_values(by=['start_date', 'end_date']).reset_index(drop=True)

Then you can iterate on the sorted rows and apply a function which does what you need. I do not see a mapping solution to this.

Catalina Chircu
  • 1,506
  • 2
  • 8
  • 19