I have a large DataFrame like this:
    hour  value
0      0      1
1      6      2
2     12      3
3     18      4
4      0      5
5      6      6
6     12      7
7     18      8
8      6      9
9     12     10
10    18     11
11    12     12
12    18     13
13     0     14
To avoid confusion: the hour column holds the hour of the day, sampled every 6 hours (0, 6, 12, 18). The value column holds the measurements; the values shown here are only placeholders, not the real data.
If you look closely at the hour column, you can see that some hours are missing. For instance, there is a gap between rows 7 and 8 (hour 0 is missing). There are also bigger gaps, such as between rows 10 and 11 (hours 0 and 6 are missing).
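The gaps described above can be located without an explicit loop. The following is a minimal sketch, assuming the readings are always 6 hours apart, the rows are in chronological order, and no gap is a full day or longer (such a gap cannot be seen from the hour column alone). The names STEP, elapsed, missing, and gaps are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "hour":  [0, 6, 12, 18, 0, 6, 12, 18, 6, 12, 18, 12, 18, 0],
    "value": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14],
})

STEP = 6  # assumed spacing between consecutive readings, in hours

# Hours elapsed since the previous row, wrapped to one day (mod 24),
# so that going from 18 to 0 counts as 6 hours rather than -18.
elapsed = df["hour"].diff().mod(24)

# Number of readings skipped just before each row (the first row has
# no predecessor, so its NaN is treated as "nothing missing").
missing = (elapsed // STEP - 1).fillna(0).astype(int)

# Keep only the rows that are preceded by at least one missing reading.
gaps = missing[missing > 0]
print(gaps)
```

On the sample data this flags rows 8 and 11, which matches the gaps found by eye.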
What do I need? I would like to detect where an hour (and therefore its value) is missing, and complete the DataFrame by inserting a row there with the corresponding hour and np.nan as the value.
What have I thought of so far? I think this could be solved with modular arithmetic, in this case mod 24, since 18 + 6 = 24 ≡ 0 (mod 24). Starting a counter at zero and adding 6 at each step, with the counter reduced mod 24, I could check whether each row's hour matches the expected hour and, if not, insert a new row with the expected hour and np.nan as the value.
I don't know how to implement this modular arithmetic in Python while iterating over a DataFrame column.
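For reference, here is a rough sketch of how the mod 24 idea might look in pandas. It is not a definitive implementation, and it swaps the row-by-row counter for a vectorized variant: the wrapped differences are accumulated into a running clock that keeps counting past 24, and the data is then reindexed onto every 6-hour tick of that clock. It assumes the rows are in chronological order, consecutive rows never repeat the same hour, and no gap is 24 hours or longer. The function name fill_missing_hours and its step parameter are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "hour":  [0, 6, 12, 18, 0, 6, 12, 18, 6, 12, 18, 12, 18, 0],
    "value": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14],
})

def fill_missing_hours(df, step=6):
    """Insert rows with value NaN wherever a reading is missing (hypothetical helper)."""
    # Hours elapsed since the previous row, wrapped to one day (mod 24).
    elapsed = df["hour"].diff().mod(24)
    # The first row has no predecessor: start the clock at its own hour.
    elapsed.iloc[0] = df["hour"].iloc[0]
    # Running clock in hours that does not wrap: 0, 6, ..., 18, 24, 30, ...
    clock = elapsed.cumsum().astype(int)
    # Every tick the clock should have hit between the first and last row.
    full = np.arange(clock.iloc[0], clock.iloc[-1] + step, step)
    # Align the known values to the full clock; absent ticks become NaN.
    values = pd.Series(df["value"].to_numpy(), index=clock.to_numpy()).reindex(full)
    # Wrap the clock back to hour-of-day for the output.
    return pd.DataFrame({"hour": full % 24, "value": values.to_numpy()})

filled = fill_missing_hours(df)
print(filled)
```

Because NaN is a float, the value column comes back as float64 even if the input was integer.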
Thank you very much.