
Question:

Is it possible to use a tracking algorithm on street-level images like Google's panoramic Street View? Tracking over video is possible, and each video frame is equivalent to an image, but these images are taken 5 meters apart.

What I have tried:

I have tried the Deep Sort tracking algorithm, but it is not accurate and mostly loses the objects. I couldn't find much information on Google about how to track over a set of images instead of a video.

Note:

I have a directory full of panoramic images which were each taken 5 meters apart. I see the same objects in multiple images but am not able to track them.


Any help or guidance is appreciated.

Salvatore
Profstyle

2 Answers


Tracking a set of images instead of a video shouldn't be a problem; it's the same as a video with a very low frame rate. I think the most likely reason the matching is failing is fisheye distortion (a result of the 360° imaging), which you will need to remove before you can match the signs.

360-degree cameras generally use two or more wide-angle cameras with fisheye lenses to capture the photos and then stitch them together in software. While this gives satisfactory 360-degree images, fisheye lenses add a lot of positive radial distortion. This means that as the object you want to track moves through the camera's field of view, it gets distorted and no longer "looks" like the original object.
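To make the effect concrete, radial distortion is commonly modeled by scaling each point by a polynomial in its squared distance from the image center. A minimal sketch with a single hypothetical coefficient `k1` (the value 0.3 is made up for illustration):

```python
import numpy as np

def distort(x, y, k1=0.3):
    """Apply a one-coefficient radial distortion model to a normalized point."""
    r2 = x**2 + y**2           # squared distance from the optical center
    factor = 1 + k1 * r2       # positive k1 pushes points outward
    return x * factor, y * factor

# A point far from the center is displaced much more than one near it
print(distort(0.1, 0.0))   # ~ (0.1003, 0.0)
print(distort(0.9, 0.0))   # ~ (1.1187, 0.0)
```

This is why an object "changes shape" as it moves from the center of the frame toward the edges.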

Usually you would have access to the original cameras and could perform camera calibration to get a camera and distortion matrix which you could then use to undistort your images as detailed in the OpenCV docs. This is a good place for more background on where distortion comes from and how to deal with it.

Without calibration parameters, there are a few things you could try:

Estimate Camera and Distortion Matrices

This answer on the Signal Processing StackExchange mentions how to do this:

Compute the homography using findHomography then use warpPerspective to warp your images

The full post has more detail on how to do this, but it is fairly straightforward and I've used it before with decent success. findHomography will give you the parameters you need to pass to warpPerspective to remove the distortion without knowing the intrinsic camera parameters.

If that doesn't work for some reason, you could try the following less sophisticated approach:

Estimate a fisheye (radial) distortion via trial and error and pass it to undistort to correct the images

This answer and this answer detail how to do this. You won't know the distortion parameters so you can try some and see what values have better or worse results. I would only try this if the first method doesn't work.

Extra Info

I found this slightly dated research paper that integrates undistortion into a fast tracking algorithm.

Salvatore

You need to account for the spherical projection of your camera lens. First transform the spherical coordinates into a 2D projection; after that you can use a Kalman filter, which is an algorithm for predicting position in the presence of noise or measurement error:

import cv2 as cv
import numpy as np

class KalmanTracker:
    def __init__(self):
        # State is [x, y, vx, vy]; only [x, y] is measured
        self.kf = cv.KalmanFilter(4, 2)
        self.kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                              [0, 1, 0, 0]], np.float32)
        self.kf.transitionMatrix = np.array([[1, 0, 1, 0], [0, 1, 0, 1],
                                             [0, 0, 1, 0], [0, 0, 0, 1]], np.float32)

    def Estimate(self, coordX, coordY):
        '''Correct the filter with a measurement, then predict the next position.'''
        measured = np.array([[np.float32(coordX)], [np.float32(coordY)]])
        self.kf.correct(measured)
        predicted = self.kf.predict()
        return predicted
Adrian Romero
  • Do you know why the Deep Sort algo is failing? Should be easy to determine the reason if it has decent monitoring/logging features, and not too hard to write if needed. – befunkt Mar 19 '20 at 09:51
  • And I don't know anything about the Deep Sort algo, but if it is a well known, powerful (and properly implemented) tool, I would guess that it is failing with Google Maps because those images are heavily distorted in order to create the interactive panoramic effect. If that's the case, the distortion can likely be corrected enough for your purposes, assuming the image distortion is in no way arbitrary. – befunkt Mar 19 '20 at 10:02