As far as I'm aware there is no off-the-shelf support for this kind of stitching yet. To stitch this type of image successfully, you first have to map the pixel coordinates for each lens onto a sphere, each lens giving a partial result. You then use a feature-matching algorithm that works in polar coordinates (longitude and latitude) to align the partial results; the usual combination is SIFT for feature detection plus RANSAC to reject bad matches. The stitched result is then typically mapped back onto a 2D plane with a projection such as Mercator.
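To make the first step concrete, here is a minimal sketch of the pixel-to-sphere mapping, assuming an equidistant fisheye model (distance from the image centre proportional to the angle off the optical axis) and an assumed 190° field of view. The function name, the model, and the FOV are illustrative, not specific to your equipment:

```python
import math

def fisheye_to_latlon(u, v, cx, cy, radius, fov_deg=190.0):
    """Map a pixel (u, v) on an equidistant fisheye image to
    longitude/latitude (radians) on the unit sphere.

    cx, cy   -- centre of the fisheye circle, in pixels
    radius   -- radius of the fisheye circle, in pixels
    fov_deg  -- lens field of view; equipment-dependent (assumed here)
    """
    dx, dy = u - cx, v - cy
    r = math.hypot(dx, dy)
    # Equidistant model: radial distance is proportional to the
    # angle theta away from the optical axis.
    theta = (r / radius) * math.radians(fov_deg) / 2.0
    phi = math.atan2(dy, dx)
    # Unit ray, with the optical axis along +z.
    sx = math.sin(theta) * math.cos(phi)
    sy = math.sin(theta) * math.sin(phi)
    sz = math.cos(theta)
    lon = math.atan2(sx, sz)
    lat = math.asin(sy)
    return lon, lat
```

The centre of the lens maps to (0, 0) and a pixel on the rim maps to a longitude of ±half the field of view, which is what lets the two lenses' overlapping rims be compared in a common spherical frame.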
I have created manual stitching filters for this kind of image myself and can tell you that there are a great many configurable elements in the equation that depend on the equipment used, not least of which is the angle of view of the lens.
Comparing 2D coordinates on fisheye lenses will rarely give good results when the matched areas are on the periphery of the lens. By way of demonstration, look at the computer mouse in the image you've given. On the right its general profile is flat and lozenge-like. On the left, it's distinctly banana-shaped.
You may get some traction by first converting both images into squares using a fisheye removal filter, although you'll probably find you have little control over whether the images are stitched side by side, one above the other, or at some strange angle where they meet over a corner.
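A minimal sketch of that fisheye-removal step, again assuming the equidistant model above: for each pixel of a rectilinear (defished) output image, compute which fisheye pixel it should sample. In practice you would build these coordinates into a lookup grid and feed it to something like OpenCV's `cv2.remap`; the output FOV and the function name here are illustrative:

```python
import math

def defish_coords(xo, yo, out_w, out_h, out_fov_deg, cx, cy, radius, fish_fov_deg):
    """For output pixel (xo, yo) of a rectilinear image of size
    out_w x out_h with horizontal FOV out_fov_deg, return the
    (u, v) source coordinates on the equidistant fisheye image.
    """
    # Pinhole focal length implied by the requested output FOV.
    f = (out_w / 2.0) / math.tan(math.radians(out_fov_deg) / 2.0)
    x = xo - out_w / 2.0
    y = yo - out_h / 2.0
    # Ray through the pinhole: angle off the optical axis and azimuth.
    theta = math.atan2(math.hypot(x, y), f)
    phi = math.atan2(y, x)
    # Equidistant model: angle maps linearly to radial distance.
    r = radius * theta / (math.radians(fish_fov_deg) / 2.0)
    return cx + r * math.cos(phi), cy + r * math.sin(phi)
```

Note that a rectilinear projection can only represent well under 180° of the fisheye's view, so each defished image covers a crop of the lens; that is one reason the subsequent stitch geometry (side by side, stacked, or corner-to-corner) is hard to control.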
Here's a document you may find useful for the purpose:
https://pdfs.semanticscholar.org/9616/0d2df798a8c4de08fd669b1d091f519b3fe8.pdf