OpenCV has the functions estimateRigidTransform() and warpAffine() which handle this sort of problem really well.
It's pretty much as simple as this:
Mat M = estimateRigidTransform(frame1, frame2, false);
warpAffine(frame2, output, M, Size(640, 480), INTER_NEAREST | WARP_INVERSE_MAP);
Now output contains the contents of frame2 that is best aligned to fit frame1.
For large shifts, M will be a zero matrix or an empty Mat (no matrix at all), depending on the version of OpenCV, so you have to detect those results and skip the warp for them. I'm not sure how large a shift triggers this; maybe half the frame width, maybe more.
The third parameter to estimateRigidTransform is a boolean that tells it whether to estimate an arbitrary affine matrix (true) or to restrict the result to translation, rotation and uniform scaling (false). For stabilizing an image from a camera you probably want the latter. In fact, for camera image stabilization you might also want to remove any scaling from the returned matrix by normalizing it so that only rotation and translation remain.
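To make the "remove any scaling" step concrete, here is a minimal sketch in plain C++. It assumes the restricted matrix has the form [s*cos(a), -s*sin(a), tx; s*sin(a), s*cos(a), ty] and that you have copied its six values into a row-major array. The Affine2x3 type and the stripScale name are my own for illustration, not part of OpenCV, and leaving the translation untouched is a choice rather than the only option.

```cpp
#include <array>
#include <cmath>

// Hypothetical helper type: a 2x3 affine matrix stored row-major as
// [a, b, tx, c, d, ty].
using Affine2x3 = std::array<double, 6>;

// Writes the rotation + translation part of a similarity transform to `out`.
// Returns false for a degenerate input (for example an all-zero matrix),
// in which case `out` is left unchanged and the frame should be skipped.
bool stripScale(const Affine2x3& m, Affine2x3& out) {
    // For a similarity transform the first column is (s*cos, s*sin),
    // so its length is the scale factor s.
    const double scale = std::hypot(m[0], m[3]);
    if (scale < 1e-9) {
        return false;
    }
    const double c = m[0] / scale;  // cos(angle)
    const double s = m[3] / scale;  // sin(angle)
    // Rebuild a pure rotation and keep the translation as estimated.
    out = Affine2x3{c, -s, m[2], s, c, m[5]};
    return true;
}
```

The false return also covers the zero-matrix case from large shifts, so the same check can be used to decide whether to warp the frame at all.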
Also, for a moving camera, you'd probably want to sample M through time and calculate a mean.
Here are links to more info on estimateRigidTransform() and warpAffine().