This is related to a previous question: Python pandas change duplicate timestamp to unique, hence the similar name of this one.
The additional requirement is to handle multiple duplicates per second and space them out evenly within each second, i.e. the k-th of n duplicates sharing the same second is shifted by k/n seconds:
....
2011/1/4 9:14:00
2011/1/4 9:14:00
2011/1/4 9:14:01
2011/1/4 9:14:01
2011/1/4 9:14:01
2011/1/4 9:14:01
2011/1/4 9:14:01
2011/1/4 9:14:02
2011/1/4 9:14:02
2011/1/4 9:14:02
2011/1/4 9:14:03
....
Should become:
....
2011/1/4 9:14:00
2011/1/4 9:14:00.500
2011/1/4 9:14:01
2011/1/4 9:14:01.200
2011/1/4 9:14:01.400
2011/1/4 9:14:01.600
2011/1/4 9:14:01.800
2011/1/4 9:14:02
2011/1/4 9:14:02.333
2011/1/4 9:14:02.666
2011/1/4 9:14:03
....
I am stumped on how to deal with the variable number of duplicates. I thought along the lines of a groupby(), but couldn't get it right. This seems like a common enough use case to have been solved already.
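For reference, this is the flavour of groupby() approach I was attempting, as a minimal untested sketch; ts here is a stand-in Series built from the sample input above:

    import pandas as pd

    # Stand-in for the real data: a sorted Series of duplicated timestamps.
    ts = pd.Series(pd.to_datetime([
        "2011/1/4 9:14:00", "2011/1/4 9:14:00",
        "2011/1/4 9:14:01", "2011/1/4 9:14:01", "2011/1/4 9:14:01",
        "2011/1/4 9:14:01", "2011/1/4 9:14:01",
        "2011/1/4 9:14:02", "2011/1/4 9:14:02", "2011/1/4 9:14:02",
        "2011/1/4 9:14:03",
    ]))

    # Within each run of identical timestamps, shift the k-th of n
    # duplicates by k/n seconds so each group spans its second evenly.
    g = ts.groupby(ts)
    offsets = g.cumcount() / g.transform("count")
    ts_unique = ts + pd.to_timedelta(offsets, unit="s")

This seems to give 0.2 s steps for the five duplicates and 0.333 s steps for the three, matching the desired output above, but I don't know whether it is the idiomatic way to do it, or whether something built-in already covers this.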