
I am trying to automate a robotic arm using MATLAB. A camera will be mounted on the base of the robotic arm. It will capture snapshots of the camera frames, run image processing on them, detect the target (object), and find its pixel coordinates. Those pixel coordinates then need to be mapped to real-world x-y-z metric coordinates, and those real-world coordinates (x, y, z) will serve as the input to an inverse kinematics function, which computes the joint angles (theta) so the servos can move.
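Roughly, the flow I have in mind looks like this (a sketch only; `detectTarget`, `pixelToWorld`, and `inverseKinematics` are placeholders for my own routines, not library functions, and the camera object depends on the hardware):

```matlab
% Sketch of the intended pipeline; detectTarget, pixelToWorld and
% inverseKinematics are hypothetical placeholders, not library calls.
cam = webcam();                       % requires the MATLAB webcam support package
img = snapshot(cam);                  % grab one frame
[u, v] = detectTarget(img);           % image processing -> pixel coordinates
[X, Y, Z] = pixelToWorld(u, v);       % the mapping step this question is about
theta = inverseKinematics(X, Y, Z);   % joint angles for the servos
% theta is then sent to the servo controller
```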

I'm stuck at the mapping step: how should the pixel coordinates be mapped to real-world x-y-z metric coordinates? I don't know how to do it and have no idea how to proceed. If anyone has any leads, please share them!

PS: Do any of you think I should use something other than MATLAB for automating the robotic arm? All of this code will eventually run on a Raspberry Pi 3 in a ROS environment.

Best regards,

Hitesh Kumar

  • have you seen this https://stackoverflow.com/q/9124731/5025009 ? – seralouk Jul 15 '18 at 09:11
  • and this http://make3d.cs.cornell.edu/index.html ? – seralouk Jul 15 '18 at 09:12
  • @seralouk I have seen them, but some say that you need to estimate the camera intrinsics, extrinsics, and lens distortion parameters (a calibration sketch follows these comments). – Hitesh Jul 15 '18 at 09:36
  • Is this more of a math problem than a coding one? What you are looking for is an algorithm to translate 2D coordinates to 3D. In terms of software, MATLAB cannot be installed on the RPi, but you can control and acquire data from the RPi through MATLAB (see the [documentation](https://uk.mathworks.com/hardware-support/raspberry-pi-matlab.html)). You can also try [Octave](https://www.gnu.org/software/octave/), a MATLAB clone that is free to use. – Anthony Jul 15 '18 at 14:31
  • A single image without further information is not enough to obtain 3D information. If you have more knowledge, for example the size of the target or an AR marker, then it is possible to do this with a single image. Alternatively, if you can move the camera around and the relevant environment is static, you can effectively get a stereo view and use triangulation after feature detection. – fibonatic Jul 19 '18 at 06:41
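As the comments note, the camera intrinsics (and lens distortion) must be estimated first. In MATLAB this can be done with the Computer Vision Toolbox calibration functions; below is a minimal sketch, assuming a folder of checkerboard photos (the folder name and square size are placeholder values):

```matlab
% Estimate camera intrinsics from checkerboard images
% (Computer Vision Toolbox; folder name and square size are examples).
imds = imageDatastore('calibration_images');
[imagePoints, boardSize] = detectCheckerboardPoints(imds.Files);
squareSize = 25;  % checkerboard square size in millimetres (example value)
worldPoints = generateCheckerboardPoints(boardSize, squareSize);
I = readimage(imds, 1);
params = estimateCameraParameters(imagePoints, worldPoints, ...
    'ImageSize', [size(I, 1), size(I, 2)]);
% params holds the intrinsics (focal lengths, principal point)
% and distortion coefficients needed for the pixel-to-world mapping.
```

The `cameraCalibrator` app offers the same workflow interactively.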

1 Answer


To calculate the 3D world point for a given pixel in an image, you need depth information (e.g., from a 3D camera such as a Kinect). Once you have the depth information along with the camera intrinsics and extrinsics, you can convert 2D pixels to 3D world coordinates and vice versa.

Below are the equations for calculating X, Y, and Z in world coordinates.

X = (u − c_x) · Z / f_x
Y = (v − c_y) · Z / f_y

where (u, v) is the pixel coordinate, Z is the measured depth at that pixel, (f_x, f_y) are the focal lengths in pixels, and (c_x, c_y) is the principal point.
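A direct MATLAB translation of these equations; the intrinsic values below are placeholders, to be replaced by your own calibration results:

```matlab
% Back-project pixel (u, v) with measured depth Z into 3D coordinates.
% fx, fy, cx, cy come from calibration; all numbers here are example values.
fx = 525;   fy = 525;      % focal lengths in pixels
cx = 319.5; cy = 239.5;    % principal point
u = 400; v = 300;          % detected pixel coordinates of the target
Z = 0.85;                  % depth at (u, v) in metres, e.g. from a Kinect
X = (u - cx) * Z / fx;
Y = (v - cy) * Z / fy;
P = [X; Y; Z];             % 3D point in the camera frame
```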

Technically, perspective projection is what a camera does when converting the 3D world into a 2D image, and the equation below represents this projection.

s · [u, v, 1]ᵀ = K [R | t] [X, Y, Z, 1]ᵀ

where K is the 3×3 intrinsic matrix, [R | t] holds the extrinsics (rotation and translation from the world frame to the camera frame), and s is an arbitrary scale factor.

These are the standard pinhole camera model equations.
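A minimal MATLAB sketch of this forward projection; K, R, and t below are example values standing in for calibrated parameters:

```matlab
% Forward perspective projection: world point -> pixel coordinates.
% K, R, t are assumed known from calibration (example values below).
K = [525   0 319.5;
       0 525 239.5;
       0   0     1];        % intrinsic matrix (example values)
R = eye(3);                 % world-to-camera rotation (example)
t = [0; 0; 0];              % world-to-camera translation (example)
Pw = [0.10; -0.05; 0.85];   % 3D world point in metres (example)
p  = K * (R * Pw + t);      % homogeneous image point, s*[u; v; 1]
u  = p(1) / p(3);
v  = p(2) / p(3);           % pixel coordinates of the projection
```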

nayab