For each group of ["etf_ticker", "ticker"], I want to keep only the rows starting at the first non-NaN value (the first row where buy_sell is filled in). My dataset:
trade_date,etf_ticker,ticker,buy_sell,quantity,weighting,open,low,high,close,volume,spy_close,qqq_close,10YUSD_close
2021-02-05,ARKF,1833.HK,,,,112.5,102.5,112.5,104.900002,9714617.0,386.444305,330.942017,1.17
2021-02-09,ARKF,1833.HK,,,,103.900002,101.900002,105.900002,104.5,5574818.0,388.976013,333.089325,1.157
2021-02-10,ARKF,1833.HK,Buy,497800.0,0.1854,107.800003,104.0,108.0,105.699997,3336335.0,388.806549,332.330261,1.133
2021-02-11,ARKF,1833.HK,Buy,2098800.0,0.7583,127.0,127.0,127.0,127.0,0.0,389.434509,334.157928,1.158
2021-02-18,ARKF,1833.HK,,,,139.0,130.600006,140.600006,134.600006,9742566.0,389.444489,332.050629,1.287
2021-03-11,ARKF,1833.HK,Buy,965000.0,0.2978,95.949997,95.0,98.849998,98.400002,6335100.0,392.2453,317.638824,1.527
2021-03-29,ARKF,1833.HK,,,,94.099998,93.900002,100.599998,99.050003,5850980.0,394.730011,314.320007,1.726
2021-03-31,ARKF,1833.HK,,,,97.800003,97.599998,103.800003,103.300003,6723838.0,400.609985,324.570007,1.679
2021-03-26,ARKF,AAPL,,,,120.349998,118.919998,121.480003,121.209999,93958900.0,395.980011,316.0,1.66
2021-03-28,ARKF,AAPL,,,,121.650002,120.730003,122.580002,121.389999,80819200.0,395.779999,315.910004,1.721
2021-03-29,ARKF,AAPL,Sell,77439.0,0.245,120.110001,118.860001,120.400002,119.900002,85671900.0,394.730011,314.320007,1.726
2021-03-30,ARKF,AAPL,,,,121.650002,121.150002,123.519997,122.150002,118323800.0,396.329987,319.130005,1.746
The expected result is:
trade_date,etf_ticker,ticker,buy_sell,quantity,weighting,open,low,high,close,volume,spy_close,qqq_close,10YUSD_close
2021-02-10,ARKF,1833.HK,Buy,497800.0,0.1854,107.800003,104.0,108.0,105.699997,3336335.0,388.806549,332.330261,1.133
2021-02-11,ARKF,1833.HK,Buy,2098800.0,0.7583,127.0,127.0,127.0,127.0,0.0,389.434509,334.157928,1.158
2021-02-18,ARKF,1833.HK,,,,139.0,130.600006,140.600006,134.600006,9742566.0,389.444489,332.050629,1.287
2021-03-11,ARKF,1833.HK,Buy,965000.0,0.2978,95.949997,95.0,98.849998,98.400002,6335100.0,392.2453,317.638824,1.527
2021-03-29,ARKF,1833.HK,,,,94.099998,93.900002,100.599998,99.050003,5850980.0,394.730011,314.320007,1.726
2021-03-31,ARKF,1833.HK,,,,97.800003,97.599998,103.800003,103.300003,6723838.0,400.609985,324.570007,1.679
2021-03-29,ARKF,AAPL,Sell,77439.0,0.245,120.110001,118.860001,120.400002,119.900002,85671900.0,394.730011,314.320007,1.726
2021-03-30,ARKF,AAPL,,,,121.650002,121.150002,123.519997,122.150002,118323800.0,396.329987,319.130005,1.746
I have checked a few examples (e.g. ones using last_valid_index), but I am not sure how to apply them within a groupby.
I could loop through the dataset for each group and build a new dataframe, but I was wondering if there is a better way. Thanks.
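For reference, one loop-free approach that might work here is a per-group running flag: mark the rows where buy_sell is not NaN, then take a cumulative maximum within each group so that every row from the first marked one onward is kept. This is only a sketch under a few assumptions: buy_sell is the column that decides where a group "starts", the rows are already in date order within each group, and the sample below is a trimmed copy of the data above (fewer columns and rows), with `has_value`, `seen` and `result` being names made up for illustration.

```python
import io

import pandas as pd

# Trimmed copy of the sample data (assumption: only these columns matter here).
csv_text = """trade_date,etf_ticker,ticker,buy_sell,quantity
2021-02-05,ARKF,1833.HK,,
2021-02-09,ARKF,1833.HK,,
2021-02-10,ARKF,1833.HK,Buy,497800.0
2021-02-11,ARKF,1833.HK,Buy,2098800.0
2021-02-18,ARKF,1833.HK,,
2021-03-26,ARKF,AAPL,,
2021-03-28,ARKF,AAPL,,
2021-03-29,ARKF,AAPL,Sell,77439.0
2021-03-30,ARKF,AAPL,,
"""
df = pd.read_csv(io.StringIO(csv_text))

# 1 where buy_sell is present, 0 where it is NaN.
has_value = df["buy_sell"].notna().astype(int)

# Running maximum within each (etf_ticker, ticker) group: the flag stays 0
# until the first non-NaN row and is 1 from that row onward.
seen = has_value.groupby([df["etf_ticker"], df["ticker"]]).cummax().astype(bool)

# Keep only the rows at or after the first non-NaN value of each group.
result = df[seen]
print(result)
```

If the rows are not guaranteed to be in date order, sorting by ["etf_ticker", "ticker", "trade_date"] first would probably be needed for the "first" row to mean the earliest one.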