I have a data-frame which has columns like:
colA colB colC colD colE flag
A X 2018Q1 500 600 1
A X 2018Q2 200 800 1
A X 2018Q3 100 400 1
A X 2018Q4 500 600 1
A X 2019Q1 400 7000 0
A X 2019Q2 1500 6100 0
A X 2018Q3 5600 600 1
A X 2018Q4 500 6007 1
A Y 2016Q1 900 620 1
A Y 2016Q2 750 850 0
A Y 2017Q1 750 850 1
A Y 2017Q2 750 850 1
A Y 2017Q3 750 850 1
A Y 2018Q1 750 850 1
A Y 2018Q2 750 850 1
A Y 2018Q3 750 850 1
A Y 2018Q4 750 850 1
A row at colA, colB
level passes a statistical check if at colA, colB
level the value of flag==1
for continuous 4 quarters
of data after sorting for one stride.
We have to stride like this: 2018Q1-2018Q4 then 2018Q2-2019Q1 ....
so on if there is 4 continuous quarter and flag==1 then we lable that as 1.
The final output will be like:
colA colB colC colD colE flag check_qtr
A X 2018Q1 500 600 1 1
A X 2018Q2 200 800 1 1
A X 2018Q3 100 400 1 1
A X 2018Q4 500 600 1 1
A X 2019Q1 400 7000 0 0
A X 2019Q2 1500 6100 0 0
A X 2018Q3 5600 600 1 0
A X 2018Q4 500 6007 1 0
A Y 2016Q1 900 620 1 0
A Y 2016Q2 750 850 0 0
A Y 2017Q1 750 850 1 0
A Y 2017Q2 750 850 1 0
A Y 2017Q3 750 850 1 0
A Y 2018Q1 750 850 1 1
A Y 2018Q2 750 850 1 1
A Y 2018Q3 750 850 1 1
A Y 2018Q4 750 850 1 1
- How can we do this using pandas and numpy?
- Can we implemet this is using sql?