I have dataframe (df) as below. This is the output of merging two DF's per hour. The NA's are those hours the data did not arrive into system at that hour.
Vol Nu Date lat long
0 NA 2017-01-01 01 NA NA
2 NA 2017-01-01 02 NA NA
0 NA 2017-01-01 03 NA NA
0 NA 2017-01-01 07 NA NA
0 NA 2017-01-01 08 NA NA
0 2 2017-01-01 09 80.85243 26.78307
0 NA 2017-01-01 10 NA NA
0 NA 2017-01-01 11 NA NA
0 2 2017-01-02 01 80.90426 26.77535
2 NA 2017-01-02 02 NA NA
3 NA 2017-01-02 03 NA NA
0 NA 2017-01-02 04 NA NA
0 NA 2017-01-02 05 NA NA
0 2 2017-01-02 06 80.90426 26.77535
.
.
.
.
All I need to get the output as below, where depending on a particular day, I have to clone the data in any of the hour. Ex on 2017-01-01 09 hour, I have data, so I will expand it to rest of the hours of the same data and so on with other date 2017-01-02...
Expected Output
Vol Nu Date lat long
0 2 2017-01-01 01 80.85243 26.78307
2 2 2017-01-01 02 80.85243 26.78307
0 2 2017-01-01 03 80.85243 26.78307
0 2 2017-01-01 07 80.85243 26.78307
0 2 2017-01-01 08 80.85243 26.78307
0 2 2017-01-01 09 80.85243 26.78307
0 2 2017-01-01 10 80.85243 26.78307
0 2 2017-01-01 11 80.85243 26.78307
0 2 2017-01-02 01 80.90426 26.77535
2 1 2017-01-02 02 80.90426 26.77535
3 1 2017-01-02 03 80.90426 26.77535
0 1 2017-01-02 04 80.90426 26.77535
0 1 2017-01-02 05 80.90426 26.77535
0 1 2017-01-02 06 80.90426 26.77535
It would be very helpful if you can help me.
Note: the data set is pretty big... more than a million rows populated per day. I just showed few records above.