Translate Firebase MLKit Bounding box coordinates to screen view coordinates

Question

I am using FirebaseVision object detection to detect things from the CameraX camera preview. Detection works fine, but I am trying to draw the bounding boxes of the detected items over the camera preview. The bounding box that Firebase gives back is in the coordinates of the image itself, not of the preview view, so the boxes appear in the wrong place.

The image size that I get back from Firebase is 1200x1600 and the preview size is 2425x1440.

How do I translate the bounding boxes returned from Firebase to the correct screen coordinates?

You can take a look at a good sample here https://github.com/javaherisaber/MLBarcodeScanner — Mahdi Javaheri, Sep 27 '22 at 09:37

score 10 · Accepted Answer · answered Mar 07 '20 at 00:36

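What I ended up doing was taking the size of the image that the camera produced and dividing the view's width and height by the image's width and height to get the scale factors. In portrait mode the image width and height are swapped, because the image is rotated relative to the view: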

if (isPortraitMode()) {
    // Portrait: the image is rotated, so its width maps to the view height.
    _scaleY = overlayView.height.toFloat() / imageWidth.toFloat()
    _scaleX = overlayView.width.toFloat() / imageHeight.toFloat()
} else {
    _scaleY = overlayView.height.toFloat() / imageHeight.toFloat()
    _scaleX = overlayView.width.toFloat() / imageWidth.toFloat()
}

Now that I have the scale factors, I can take the bounding box returned by the Firebase detector and multiply its x and y coordinates by them:

private fun translateX(x: Float): Float = x * _scaleX
private fun translateY(y: Float): Float = y * _scaleY

private fun translateRect(rect: Rect) = RectF(
    translateX(rect.left.toFloat()),
    translateY(rect.top.toFloat()),
    translateX(rect.right.toFloat()),
    translateY(rect.bottom.toFloat())
)

This gives you the scaled rect coordinates, which you can then draw on the screen.
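To make the arithmetic concrete, here is a minimal off-device sketch of the same scaling. `Box` and `BoxScaler` are hypothetical names standing in for `android.graphics.RectF` and the overlay logic, and the sizes (a 640x480 camera image shown in a 1080x1440 portrait view) are made-up illustration values, not the ones from the question.

```kotlin
// Plain-Kotlin stand-in for android.graphics.RectF so the sketch runs without Android.
data class Box(val left: Float, val top: Float, val right: Float, val bottom: Float)

// Hypothetical helper: scales detector boxes into overlay-view coordinates.
class BoxScaler(
    imageWidth: Int,      // image size as reported by the camera
    imageHeight: Int,
    viewWidth: Int,       // overlay view size in pixels
    viewHeight: Int,
    isPortrait: Boolean
) {
    // In portrait the upright image has its width and height swapped.
    private val uprightWidth = if (isPortrait) imageHeight else imageWidth
    private val uprightHeight = if (isPortrait) imageWidth else imageHeight

    val scaleX: Float = viewWidth.toFloat() / uprightWidth
    val scaleY: Float = viewHeight.toFloat() / uprightHeight

    fun toView(b: Box): Box =
        Box(b.left * scaleX, b.top * scaleY, b.right * scaleX, b.bottom * scaleY)
}

fun main() {
    val scaler = BoxScaler(640, 480, 1080, 1440, isPortrait = true)
    // Both scale factors are 2.25 here (1080 / 480 and 1440 / 640).
    println(scaler.toView(Box(100f, 200f, 300f, 400f)))
    // prints Box(left=225.0, top=450.0, right=675.0, bottom=900.0)
}
```

Because X and Y are scaled independently, this only lines up when the preview shows the whole image stretched to the view, which is the assumption the answer makes.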

Based on this I created a github repo for barcode scanning and placing bounding box in right position, https://github.com/mtsahakis/barcode — mtsahakis, Aug 30 '20 at 17:50
@Ayyappa this assumes the preview view is being shown full screen, if your camera can crop or something then I doubt this will work — tyczj, Jun 17 '22 at 12:17
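
On the cropping caveat: if the preview fills the view by center-cropping the image (an assumption about the setup, not something stated in the answer), one possible adjustment is a single uniform scale plus an offset instead of independent X and Y scales. The sketch below uses hypothetical names (`Box`, `mapCenterCrop`) and expects the upright image size, that is, after any rotation has been applied.

```kotlin
import kotlin.math.max

// Plain-Kotlin stand-in for android.graphics.RectF so the sketch runs without Android.
data class Box(val left: Float, val top: Float, val right: Float, val bottom: Float)

// Hypothetical helper: maps a box from upright image coordinates to view
// coordinates, assuming the image is scaled uniformly to cover the view
// and centered, with the overflow cropped equally on both sides.
fun mapCenterCrop(
    b: Box,
    imageWidth: Int, imageHeight: Int,   // upright image size
    viewWidth: Int, viewHeight: Int
): Box {
    // Use the larger ratio so the scaled image covers the whole view.
    val scale = max(viewWidth.toFloat() / imageWidth, viewHeight.toFloat() / imageHeight)
    // Negative offset on the axis where the scaled image is larger than the view.
    val dx = (viewWidth - imageWidth * scale) / 2f
    val dy = (viewHeight - imageHeight * scale) / 2f
    return Box(
        b.left * scale + dx,
        b.top * scale + dy,
        b.right * scale + dx,
        b.bottom * scale + dy
    )
}

fun main() {
    // 480x640 upright image in a 1080x1920 view: scale = 3, 180 px cropped on each side.
    println(mapCenterCrop(Box(100f, 200f, 300f, 400f), 480, 640, 1080, 1920))
    // prints Box(left=120.0, top=600.0, right=720.0, bottom=1200.0)
}
```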

score 2 · Answer 2 · answered Oct 01 '21 at 14:32

Please see my answer to "CameraX qrcode scanner detect wrong". Basically, you can use CoordinateTransform to transform coordinates from one CameraX UseCase to another.

Xi 张熹

score 2 · Answer 3 · answered Jan 06 '22 at 15:13

Thanks @tyczj,

Your answer helped me find my solution. Let me add that if someone is using the front camera, as I am for face detection, the x axis needs to be inverted. For example:

val previewSize = overlayView.width.toFloat()
val newLeft = if (isFrontCamera) previewSize - (rect.right * scaleX) else rect.left * scaleX
val newRight = if (isFrontCamera) previewSize - (rect.left * scaleX) else rect.right * scaleX
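
Put together with the vertical scaling, the mirrored mapping might look like the following off-device sketch. `Box` and `toViewBox` are hypothetical names, and `viewWidth` plays the role of `previewSize` above. Note that left and right swap roles when mirroring, so the result still has left < right.

```kotlin
// Plain-Kotlin stand-in for android.graphics.RectF so the sketch runs without Android.
data class Box(val left: Float, val top: Float, val right: Float, val bottom: Float)

// Hypothetical helper: scales a detector box into view coordinates and,
// for the front camera, mirrors it horizontally around the view's center.
fun toViewBox(
    b: Box,
    scaleX: Float,
    scaleY: Float,
    viewWidth: Float,
    isFrontCamera: Boolean
): Box {
    // Mirroring maps x to (viewWidth - x), so the old right edge becomes the new left edge.
    val left = if (isFrontCamera) viewWidth - b.right * scaleX else b.left * scaleX
    val right = if (isFrontCamera) viewWidth - b.left * scaleX else b.right * scaleX
    return Box(left, b.top * scaleY, right, b.bottom * scaleY)
}

fun main() {
    val box = Box(100f, 200f, 300f, 400f)
    println(toViewBox(box, 2.25f, 2.25f, 1080f, isFrontCamera = false))
    // prints Box(left=225.0, top=450.0, right=675.0, bottom=900.0)
    println(toViewBox(box, 2.25f, 2.25f, 1080f, isFrontCamera = true))
    // prints Box(left=405.0, top=450.0, right=855.0, bottom=900.0)
}
```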
