Let's say I have a set of 5 markers. I am trying to find the relative distances between each marker using an augmented reality framework such as ARToolkit. In my camera feed thee first 20 frames show me the first 2 markers only so I can work out the transformation between the 2 markers. The second 20 frames show me the 2nd and 3rd markers only and so on. The last 20 frames show me the 5th and 1st markers. I want to build up a 3D map of the marker positions of all 5 markers.
My question is, knowing that there will be inaccuracies with the distances due to low quality of the video feed, how do I minimise the inaccuracies given all the information I have gathered?
My naive approach would be to use the first marker as a base point, from the first 20 frames take the mean of the transformations and place the 2nd marker and so forth for the 3rd and 4th. For the 5th marker place it inbetween the 4th and 1st by placing it in the middle of the mean of the transformations between the 5th and 1st and the 4th and 5th. This approach I feel has a bias towards the first marker placement though and doesn't take into account the camera seeing more than 2 markers per frame.
Ultimately I want my system to be able to work out the map of x number of markers. In any given frame up to x markers can appear and there are non-systemic errors due to the image quality.
Any help regarding the correct approach to this problem would be greatly appreciated.
Edit: More information regarding the problem:
Lets say the realworld map is as follows:
Lets say I get 100 readings for each of the transformations between the points as represented by the arrows in the image. The real values are written above the arrows.
The values I obtain have some error (assumed to follow a gaussian distribution about the actual value). For instance one of the readings obtained for marker 1 to 2 could be x:9.8 y:0.09. Given I have all these readings how do I estimate the map. The result should ideally be as close to the real values as possible.
My naive approach has the following problem. If the average of the transforms from 1 to 2 is slightly off the placement of 3 can be off even though the reading of 2 to 3 is very accurate. This problem is shown below:
The greens are the actual values, the blacks are the calculated values. The average transform of 1 to 2 is x:10 y:2.