This question somewhat builds on this post, wherein the idea is to take the ARMeshGeometry from an iOS device with a LiDAR scanner, calculate the texture coordinates, and apply the sampled camera frame as the texture for a given mesh, thereby allowing a user to create a "photorealistic" 3D representation of their environment.
Per that post, I have adapted one of the responses to calculate the texture coordinates, like so:
func buildGeometry(meshAnchor: ARMeshAnchor, arFrame: ARFrame) -> SCNGeometry {
    let vertices = meshAnchor.geometry.vertices
    let faces = meshAnchor.geometry.faces
    let camera = arFrame.camera
    let size = arFrame.camera.imageResolution

    // Use the MTLBuffer that ARKit gives us directly as the vertex source
    let vertexSource = SCNGeometrySource(buffer: vertices.buffer,
                                         vertexFormat: vertices.format,
                                         semantic: .vertex,
                                         vertexCount: vertices.count,
                                         dataOffset: vertices.offset,
                                         dataStride: vertices.stride)

    // The anchor's transform converts mesh-local vertices into world space
    let modelMatrix = meshAnchor.transform

    var texCoords = [CGPoint]()
    for index in 0..<vertices.count {
        let vertexPointer = vertices.buffer.contents().advanced(by: vertices.offset + vertices.stride * index)
        let vertex = vertexPointer.assumingMemoryBound(to: (Float, Float, Float).self).pointee
        let vertex4 = SIMD4<Float>(vertex.0, vertex.1, vertex.2, 1)
        let world_vertex4 = simd_mul(modelMatrix, vertex4)
        let world_vector3 = simd_float3(x: world_vertex4.x, y: world_vertex4.y, z: world_vertex4.z)

        // Project the world-space vertex into the camera image and normalize
        let pt = camera.projectPoint(world_vector3,
                                     orientation: .portrait,
                                     viewportSize: CGSize(width: size.height, height: size.width))
        let v = 1.0 - Float(pt.x) / Float(size.height)
        let u = Float(pt.y) / Float(size.width)
        texCoords.append(CGPoint(x: CGFloat(v), y: CGFloat(u)))
    }

    // Set up the texture coordinates
    let textureSource = SCNGeometrySource(textureCoordinates: texCoords)

    // Set up the normals (uses a convenience initializer to wrap ARKit's normals buffer)
    let normalsSource = SCNGeometrySource(meshAnchor.geometry.normals, semantic: .normal)

    // Set up the geometry element from ARKit's face index buffer
    let faceData = Data(bytesNoCopy: faces.buffer.contents(), count: faces.buffer.length, deallocator: .none)
    let geometryElement = SCNGeometryElement(data: faceData,
                                             primitiveType: .triangles,
                                             primitiveCount: faces.count,
                                             bytesPerIndex: faces.bytesPerIndex)
    let nodeGeometry = SCNGeometry(sources: [vertexSource, textureSource, normalsSource], elements: [geometryElement])

    // Set up the texture - THIS IS WHERE I AM STUCK
    let texture = textureConverter.makeTextureForMeshModel(frame: arFrame)

    let imageMaterial = SCNMaterial()
    imageMaterial.isDoubleSided = false
    imageMaterial.diffuse.contents = texture!
    nodeGeometry.materials = [imageMaterial]
    return nodeGeometry
}
Where I am struggling is in determining whether these texture coordinates are actually being calculated properly, and subsequently, how I would sample the camera frame to apply the relevant frame image as the texture for that mesh.
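For reference, here is a minimal sanity check I have been considering (it assumes a landscape-right sensor orientation and the full camera imageResolution as the viewport size; those choices, and the helper name, are my own assumptions rather than anything from the linked post). For a vertex that is actually visible to the camera, the normalized coordinates should land roughly inside 0...1:

import ARKit

// Hypothetical helper, not from the linked answer: project a world-space vertex
// against the full camera image in landscape-right orientation and normalize.
func normalizedUV(for worldVertex: simd_float3, camera: ARCamera) -> simd_float2 {
    let size = camera.imageResolution
    let pt = camera.projectPoint(worldVertex,
                                 orientation: .landscapeRight,
                                 viewportSize: size)
    return simd_float2(Float(pt.x / size.width), Float(pt.y / size.height))
}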
The linked question indicated that converting the ARFrame's capturedImage property (which is a CVPixelBuffer) to an MTLTexture would be ideal for real-time performance, but it has become apparent to me that the CVPixelBuffer is a YCbCr image, whereas I believe I would need an RGB image.
In my textureConverter class, I am attempting to convert the CVPixelBuffer to an MTLTexture, but am unsure how to return an RGB MTLTexture:
func makeTextureForMeshModel(frame: ARFrame) -> MTLTexture? {
    if CVPixelBufferGetPlaneCount(frame.capturedImage) < 2 {
        return nil
    }
    let cameraImageTextureY = createTexture(fromPixelBuffer: frame.capturedImage,
                                            pixelFormat: .r8Unorm,
                                            planeIndex: 0)
    let cameraImageTextureCbCr = createTexture(fromPixelBuffer: frame.capturedImage,
                                               pixelFormat: .rg8Unorm,
                                               planeIndex: 1)

    // How do I blend the Y and CbCr textures, or return an RGB texture,
    // so that this returns a single MTLTexture?
    return nil // placeholder
}

func createTexture(fromPixelBuffer pixelBuffer: CVPixelBuffer, pixelFormat: MTLPixelFormat, planeIndex: Int) -> CVMetalTexture? {
    let width = CVPixelBufferGetWidthOfPlane(pixelBuffer, planeIndex)
    let height = CVPixelBufferGetHeightOfPlane(pixelBuffer, planeIndex)

    var texture: CVMetalTexture? = nil
    let status = CVMetalTextureCacheCreateTextureFromImage(nil, textureCache, pixelBuffer, nil, pixelFormat,
                                                           width, height, planeIndex, &texture)
    if status != kCVReturnSuccess {
        texture = nil
    }
    return texture
}
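For context, one approach I have been experimenting with is to let Core Image perform the YCbCr-to-RGB conversion by rendering the capturedImage into a BGRA Metal texture. This is only a rough sketch, not a working solution; makeRGBTexture, device, ciContext and commandQueue are my own placeholder names and are not part of the code above:

import ARKit
import CoreImage
import Metal

// Rough sketch: render the YCbCr capturedImage into an RGB (BGRA) Metal texture
// and let Core Image handle the color-space conversion.
func makeRGBTexture(from pixelBuffer: CVPixelBuffer,
                    device: MTLDevice,
                    ciContext: CIContext,
                    commandQueue: MTLCommandQueue) -> MTLTexture? {
    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)

    // Destination texture in an RGB pixel format
    let descriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .bgra8Unorm,
                                                              width: width,
                                                              height: height,
                                                              mipmapped: false)
    descriptor.usage = [.shaderRead, .shaderWrite]
    guard let texture = device.makeTexture(descriptor: descriptor),
          let commandBuffer = commandQueue.makeCommandBuffer() else { return nil }

    // Core Image reads the YCbCr pixel buffer and writes converted RGB pixels into the texture
    let image = CIImage(cvPixelBuffer: pixelBuffer)
    ciContext.render(image,
                     to: texture,
                     commandBuffer: commandBuffer,
                     bounds: CGRect(x: 0, y: 0, width: width, height: height),
                     colorSpace: CGColorSpaceCreateDeviceRGB())
    commandBuffer.commit()
    return texture
}

If something along those lines is viable, I assume the resulting MTLTexture could be assigned directly to imageMaterial.diffuse.contents, since SceneKit accepts a Metal texture as material contents.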
In the end, I'm not entirely sure whether I really need an RGB texture vs. a YCbCr texture, but I am still unsure how I would return the proper image for texturing (my attempts to return just the CVPixelBuffer, without worrying about the YCbCr color space, by manually setting a texture format, result in a very bizarre-looking image).