I am new to Kalman filters and am trying to use one to predict missing values and to obtain smoothed observations from GPS data (latitude and longitude).
I am using pykalman, and my code looks like this:
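import numpy as np
from pykalman import KalmanFilter

data = data[['Lat', 'Lon']]
measurements = np.asarray(data, dtype='float')
# mask NaN/inf entries so the filter treats them as missing observations
measurements_masked = np.ma.masked_invalid(measurements)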
# initial state of the form [x0, x0_dot, x1, x1_dot]
initial_state_mean = [
    measurements[0, 0],
    0,
    measurements[0, 1],
    0
]
initial_state_covariance = [[10, 0, 0, 0],
                            [ 0, 1, 0, 0],
                            [ 0, 0, 1, 0],
                            [ 0, 0, 0, 1]]
# transition matrix to estimate new position given old position
transition_matrix = [
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1]
]
observation_matrix = [
    [1, 0, 0, 0],
    [0, 0, 1, 0]
]
kf = KalmanFilter(
    transition_matrices=transition_matrix,
    observation_matrices=observation_matrix,
    initial_state_mean=initial_state_mean,
)
filtered_state_means = np.zeros((len(measurements), 4))
filtered_state_covariances = np.zeros((len(measurements), 4, 4))
for i in range(len(measurements)):
    if i == 0:
        filtered_state_means[i] = initial_state_mean
        filtered_state_covariances[i] = initial_state_covariance
    else:
        filtered_state_means[i], filtered_state_covariances[i] = (
            kf.filter_update(
                filtered_state_means[i-1],
                filtered_state_covariances[i-1],
                observation=measurements_masked[i])
        )
Here data is a pandas DataFrame from which the latitude and longitude columns are extracted.
Is this logic correct? I would also like to predict each missing value from the observations closest to it. For example, if the 5th, 6th and 7th observations in an array of 10 samples are missing, it seems to make more sense to predict the 5th from the 4th sample, the 7th from the 8th sample, and the 6th by averaging the predicted 5th and 7th.
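To make the idea concrete, here is a minimal sketch of the fill rule I have in mind, written in plain Python for a single coordinate. It does not use pykalman at all. The function name `fill_gaps_from_nearest` is hypothetical, and the handling of gaps longer than three samples (filling from both ends inward) and of gaps at the start or end of the array is my own assumption about how the rule would generalize.

```python
import math


def fill_gaps_from_nearest(values):
    """Fill runs of NaN from both ends inward (hypothetical helper).

    The first missing sample copies the last known value before the gap,
    the last missing sample copies the first known value after the gap,
    and a single sample left in the middle gets the average of the two.
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if not math.isnan(filled[i]):
            i += 1
            continue
        # locate the run of missing values: indices start .. stop - 1
        start = i
        while i < n and math.isnan(filled[i]):
            i += 1
        stop = i
        left = filled[start - 1] if start > 0 else None
        right = filled[stop] if stop < n else None
        if left is None and right is None:
            break  # every sample is missing, nothing to copy from
        # assumption: a gap touching an edge just copies the one known side
        if left is None:
            left = right
        if right is None:
            right = left
        lo, hi = start, stop - 1
        while lo < hi:
            filled[lo] = left
            filled[hi] = right
            lo += 1
            hi -= 1
        if lo == hi:
            # one sample left in the middle: average of both sides
            filled[lo] = (left + right) / 2
    return filled


nan = math.nan
samples = [1.0, 2.0, 3.0, 4.0, nan, nan, nan, 8.0, 9.0, 10.0]
filled_samples = fill_gaps_from_nearest(samples)
print(filled_samples)
```

With the 5th, 6th and 7th samples missing as in my example, the 5th copies the 4th, the 7th copies the 8th, and the 6th is their average.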
Does this approach make sense? If so, how can I do it using pykalman? If not, what can be done to predict missing values more accurately when many consecutive values in an array are missing?