I am working with an eye-tracking data set, and for some reason, once the column df['timestamp'] exceeds 1,000,000, the values in the dataframe are rounded off to multiples of 100. This is problematic because the eye tracker stores a new data point roughly every 20 units.
I managed to find a solution that works for me, but I was wondering: is there a more elegant, vectorized way to do this?
import numpy as np

# create a column that tracks the difference between consecutive timestamps
df['dt'] = df['timestamp'] - df['timestamp'].shift(1)
# I want to keep the old timestamps, so I make a new column
df['new_timestamp'] = df['timestamp']
# at most five consecutive rows share a rounded value (100 / 20 = 5),
# so five passes are enough
for i in range(1, 6):
    # wherever the timestamp did not advance, push it forward by 20
    df['new_timestamp'] = np.where(df['dt'] == 0,
                                   df['new_timestamp'] + 20,
                                   df['new_timestamp'])
    # recompute the differences on the corrected column
    df['dt'] = df['new_timestamp'] - df['new_timestamp'].shift(1)
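For what it's worth, the kind of vectorized approach I was imagining would label each run of identical timestamps and offset the duplicates by their position within the run. A rough sketch (it assumes every run of duplicates is caused by the rounding, and that the true spacing is always exactly 20):

# run id: increments each time the timestamp changes
group = (df['timestamp'] != df['timestamp'].shift()).cumsum()
# within each run, offset rows by 0, 20, 40, ... according to their position
df['new_timestamp'] = df['timestamp'] + df.groupby(group).cumcount() * 20

But I am not sure whether something along these lines is robust or idiomatic, so any pointers are welcome.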
Edit:
To be more precise, certain values have a pattern like this:
Current      Corrected
5113100.0    5113100.0
5113100.0    5113120.0
5113100.0    5113140.0
5113100.0    5113160.0
5113100.0    5113180.0
5113200.0    5113200.0
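For reference, this six-row pattern can be reproduced with a toy frame (the column name matches the one used above):

import pandas as pd
df = pd.DataFrame({'timestamp': [5113100.0] * 5 + [5113200.0]})
# after the correction, 'new_timestamp' should read
# 5113100.0, 5113120.0, 5113140.0, 5113160.0, 5113180.0, 5113200.0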