I am trying to use the ARCamera matrices to convert a 3D point to 2D in ARKit/SceneKit. Previously, I used projectPoint
to get the projected x and y coordinates, which works fine. However, the app slowed down significantly and would crash when appending to long recordings.
So I turned to another approach: using the ARCamera parameters to do the conversion on my own. The Apple documentation for projectionMatrix did not give much detail, so I looked into the theory behind projection matrices ("The Perspective and Orthographic Projection Matrix" and "Metal Tutorial"). From my understanding, when we have a 3D point P = (x, y, z), in theory we should be able to get the 2D point like so: P'(2D) = P(3D) * projectionMatrix.
Assuming that is the case, I did:
func session(_ session: ARSession, didUpdate frame: ARFrame) {
    // Use the frame passed to the delegate rather than re-reading session.currentFrame.
    let arCamera = frame.camera
    // intrinsics: a matrix that converts between the 2D camera plane and 3D world coordinate space.
    // projectionMatrix: a transform matrix appropriate for rendering 3D content to match the image captured by the camera.
    print("ARCamera ProjectionMatrix = \(arCamera.projectionMatrix)")
    print("ARCamera Intrinsics = \(arCamera.intrinsics)")
}
I am able to get the projection matrix and the intrinsics (I even tried printing the intrinsics to see whether they change), but both are the same for every frame.
ARCamera ProjectionMatrix = simd_float4x4([[1.774035, 0.0, 0.0, 0.0], [0.0, 2.36538, 0.0, 0.0], [-0.0011034012, 0.00073593855, -0.99999976, -1.0], [0.0, 0.0, -0.0009999998, 0.0]])
ARCamera Intrinsics = simd_float3x3([[1277.3052, 0.0, 0.0], [0.0, 1277.3052, 0.0], [720.29443, 539.8974, 1.0]])...
I am not sure I understand what is happening here, as I was expecting the projection matrix to be different for each frame. Can someone explain the theory behind the projection matrix in SceneKit/ARKit and validate my approach? Am I using the right matrix, or am I missing something in the code?
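For reference, here is a minimal sketch of the manual conversion I have in mind. It is not verified against projectPoint and rests on several assumptions: the point is already expressed in camera space (a world-space point would presumably first have to be multiplied by a view matrix, e.g. from ARCamera's viewMatrix(for:)), the matrix is column-major like simd_float4x4, and the viewport is 1440 x 1080, which I inferred from the principal point in the intrinsics above. The names Columns4x4, multiply and projectToPixels are hypothetical helpers, and the plain array of columns only stands in for simd_float4x4 to keep the snippet self-contained.

```swift
// Column-major 4x4 matrix stored as four columns (stand-in for simd_float4x4).
typealias Columns4x4 = [SIMD4<Float>]

/// Multiplies a column-major matrix by a column vector (M * v).
func multiply(_ m: Columns4x4, _ v: SIMD4<Float>) -> SIMD4<Float> {
    return m[0] * v.x + m[1] * v.y + m[2] * v.z + m[3] * v.w
}

/// Hypothetical helper: projects a camera-space point to pixel coordinates.
/// Returns nil when the point is behind the camera (clip-space w <= 0).
func projectToPixels(_ p: SIMD3<Float>,
                     projection: Columns4x4,
                     viewportWidth: Float,
                     viewportHeight: Float) -> (x: Float, y: Float)? {
    // 1. Homogeneous point times the projection matrix gives clip coordinates.
    let clip = multiply(projection, SIMD4<Float>(p.x, p.y, p.z, 1))
    guard clip.w > 0 else { return nil }

    // 2. Perspective divide gives normalized device coordinates in [-1, 1].
    let ndcX = clip.x / clip.w
    let ndcY = clip.y / clip.w

    // 3. Map to pixels; y is flipped because pixel rows grow downwards (assumption).
    let x = (ndcX + 1) * 0.5 * viewportWidth
    let y = (1 - ndcY) * 0.5 * viewportHeight
    return (x, y)
}

// Projection matrix exactly as printed above, one column per entry.
let printedProjection: Columns4x4 = [
    SIMD4<Float>(1.774035, 0.0, 0.0, 0.0),
    SIMD4<Float>(0.0, 2.36538, 0.0, 0.0),
    SIMD4<Float>(-0.0011034012, 0.00073593855, -0.99999976, -1.0),
    SIMD4<Float>(0.0, 0.0, -0.0009999998, 0.0)
]

// A point 1 m straight ahead of the camera (the camera looks down -Z).
if let pixel = projectToPixels(SIMD3<Float>(0, 0, -1),
                               projection: printedProjection,
                               viewportWidth: 1440,
                               viewportHeight: 1080) {
    print("pixel = (\(pixel.x), \(pixel.y))")
}
```

With the printed matrix, a point straight ahead of the camera lands roughly at the image center, which is what I would expect. The part I am unsure about is where the per-frame camera pose is supposed to enter, since nothing in this matrix seems to depend on it.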
Thank you so much in advance!