1

Hi, I'm trying to implement dlib facial landmark detection on Android. I know, but I need as many fps as I can get. There are 2 issues that I face:

  • Conversion Chain
  • Resizing

Currently, I am getting the data from a preview callback set on the camera. It outputs a byte[] holding an NV21 image. Since dlib doesn't know this image format and only knows array2d<dlib::rgb_pixel>, I need to conform the data to it. The implementation I found uses a bitmap, and when I use their code I end up with a conversion chain of byte[]->bmp->array2d. I want to implement a direct byte[]->array2d conversion.

Now, I need to improve dlib's performance by controlling the size of the image fed into it. My use case doesn't involve small faces, so I can down-scale the input image to boost performance. But let's say I succeed in making the byte[]->array2d conversion: how can I resize the image? Resizing a bitmap has many fast implementations, but I need to cut out the bitmap to extract more fps. I have the option of resizing either the byte[] or the converted array2d, but again... how? I'm guessing it's better to do the resizing after the conversion, because it will then be running in native code and not in Java.

Edit

The down-scaling should take the byte[] form (not the dlib::array2d) as input, since I need to do something with the down-scaled byte[].

So my final problem is to implement these in JNI:

byte[] resize(ByteArray img, Size targetSize);

and

dlib::array2d<rgb_pixel> convert(ByteArray img);
Hohenheim
  • You can see how renderscript (GPU based number cruncher for Android) is used to convert camera YUV to RGB: https://stackoverflow.com/a/20788170/192373. Regarding resize, you can hopefully manage it in the same step if it's OK to use simple ratios only (e.g. 1280×720 ➡ 640×360). – Alex Cohn Sep 13 '17 at 03:52
  • I am not sure if dlib will produce good answers on smaller images, you need careful benchmarks. – Alex Cohn Sep 13 '17 at 04:25
  • Hi @AlexCohn, I already made the 2 methods. The converter is working well, but I haven't seen the image yet; I just get the output landmarks from dlib, and it's giving me results. I just finished writing the resize, and I hope you can help me review it. And yeah, it's only a simple resize that preserves the ratio. https://gist.github.com/novodimaporo/a13ab0ef03a61c0f47d518f0c82aee26 – Hohenheim Sep 13 '17 at 05:50
  • What time does dlib processing take for you for different image sizes? – Alex Cohn Sep 13 '17 at 06:42
  • I'm getting ~110ms on 1024x768 @AlexCohn – Hohenheim Sep 13 '17 at 06:51
  • @AlexCohn I have good news. I was able to bring the calculation duration down to ~80ms by downscaling the img. – Hohenheim Sep 13 '17 at 07:28
  • Make sure you do all transformations and processing off the UI thread. – Alex Cohn Sep 13 '17 at 07:33

2 Answers

1

This question helped me a lot and made me understand the NV21 structure. Using the code in the question, I was able to develop a converter from an NV21 byte[] to array2d<rgb>.
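For readers who want to see what such a converter might look like, here is a minimal sketch rather than the exact code I ended up with. It assumes a tightly packed NV21 buffer with even width and height, and it uses the fixed-point BT.601 video-range coefficients commonly seen in Android YUV decoding samples. The `Rgb` struct and the `nv21ToRgb` name are hypothetical, and the output goes into a plain `std::vector` so the block compiles without dlib.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pixel type standing in for dlib::rgb_pixel.
struct Rgb {
    std::uint8_t r, g, b;
};

// Clamp a value scaled by 1024 to the 8-bit range, then drop the scale.
static std::uint8_t toByte(int scaled) {
    scaled = std::max(0, std::min(scaled, 262143));
    return static_cast<std::uint8_t>(scaled >> 10);
}

// Assumption: even width and height, tightly packed NV21 layout, i.e.
// width*height luma bytes followed by interleaved V,U pairs, one pair
// shared by each 2x2 block of pixels.
std::vector<Rgb> nv21ToRgb(const std::uint8_t* nv21, int width, int height) {
    const std::size_t w = static_cast<std::size_t>(width);
    std::vector<Rgb> out(w * height);
    const std::uint8_t* vuPlane = nv21 + w * height;
    for (int row = 0; row < height; ++row) {
        const std::uint8_t* yRow = nv21 + w * row;
        // Each chroma row is `width` bytes long and covers two pixel rows.
        const std::uint8_t* vuRow = vuPlane + w * (row / 2);
        for (int col = 0; col < width; ++col) {
            const int pair = col & ~1;  // byte offset of the V,U pair for this column
            const int y = std::max(0, yRow[col] - 16);
            const int v = vuRow[pair] - 128;
            const int u = vuRow[pair + 1] - 128;
            const int y1192 = 1192 * y;
            Rgb& px = out[w * row + col];
            px.r = toByte(y1192 + 1634 * v);
            px.g = toByte(y1192 - 833 * v - 400 * u);
            px.b = toByte(y1192 + 2066 * u);
        }
    }
    return out;
}
```

With dlib available, the same loop could presumably write into a `dlib::array2d<dlib::rgb_pixel>` directly: call `set_size(height, width)` once, then assign `img[row][col] = dlib::rgb_pixel(r, g, b)` instead of filling the vector.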

What's left unsolved now is the resize.
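One possible direction for the resize, sketched under assumptions: a nearest-neighbour scaler that works directly on the NV21 bytes, so the result is still an NV21 byte[] that can be used before conversion. It assumes even source and target dimensions and a tightly packed buffer. The `resizeNv21` name is hypothetical, and the JNI glue (obtaining the bytes with `GetByteArrayElements` and returning a new `jbyteArray`) is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: nearest-neighbour resize of a tightly packed NV21 frame.
// Assumption: srcW, srcH, dstW and dstH are all even and positive.
std::vector<std::uint8_t> resizeNv21(const std::uint8_t* src, int srcW, int srcH,
                                     int dstW, int dstH) {
    const std::size_t dstLuma = static_cast<std::size_t>(dstW) * dstH;
    std::vector<std::uint8_t> dst(dstLuma * 3 / 2);

    // Luma plane: one byte per pixel.
    for (int y = 0; y < dstH; ++y) {
        const int sy = y * srcH / dstH;
        const std::uint8_t* srcRow = src + static_cast<std::size_t>(sy) * srcW;
        std::uint8_t* dstRow = dst.data() + static_cast<std::size_t>(y) * dstW;
        for (int x = 0; x < dstW; ++x) {
            dstRow[x] = srcRow[x * srcW / dstW];
        }
    }

    // Chroma plane: (w/2) x (h/2) interleaved V,U pairs. Pairs are copied
    // whole so V and U never get mixed up.
    const std::uint8_t* srcVu = src + static_cast<std::size_t>(srcW) * srcH;
    std::uint8_t* dstVu = dst.data() + dstLuma;
    const int srcCw = srcW / 2, srcCh = srcH / 2;
    const int dstCw = dstW / 2, dstCh = dstH / 2;
    for (int y = 0; y < dstCh; ++y) {
        const int sy = y * srcCh / dstCh;
        // A chroma row holds srcW bytes (srcW/2 pairs of 2 bytes each).
        const std::uint8_t* srcRow = srcVu + static_cast<std::size_t>(sy) * srcW;
        std::uint8_t* dstRow = dstVu + static_cast<std::size_t>(y) * dstW;
        for (int x = 0; x < dstCw; ++x) {
            const int sx = x * srcCw / dstCw;
            dstRow[2 * x] = srcRow[2 * sx];          // V
            dstRow[2 * x + 1] = srcRow[2 * sx + 1];  // U
        }
    }
    return dst;
}
```

Nearest-neighbour sampling is the cheapest option but can alias on fine detail. Averaging each block of source pixels would likely give a smoother image at some extra cost.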

Hohenheim
0

Performing any resizing in Java is probably a bad idea because of poor compiler optimization. Your most performant option CPU-wise would probably be to write a specialized NV21 resize in C++, then convert to RGB. Swapping the order (convert first, then resize the RGB image) may only be slightly slower, though, and is much easier to write.
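To illustrate the easier order (convert first, then shrink the RGB image), here is a minimal sketch of a box-filter downscale by an integer factor. The `RgbPixel` struct and the `downscaleRgbBox` name are hypothetical, and the code assumes the width and height are multiples of the factor. If the image is already in a dlib image type, dlib's own `resize_image` may be the simpler choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pixel type, laid out like an 8-bit RGB triple.
struct RgbPixel {
    std::uint8_t r, g, b;
};

// Shrinks an RGB image by an integer factor, averaging each factor x factor
// block of source pixels into one output pixel (rounded to nearest).
// Assumption: srcW and srcH are multiples of factor, and factor >= 1.
std::vector<RgbPixel> downscaleRgbBox(const std::vector<RgbPixel>& src,
                                      int srcW, int srcH, int factor) {
    const int dstW = srcW / factor;
    const int dstH = srcH / factor;
    const unsigned area = static_cast<unsigned>(factor) * factor;
    std::vector<RgbPixel> dst(static_cast<std::size_t>(dstW) * dstH);
    for (int y = 0; y < dstH; ++y) {
        for (int x = 0; x < dstW; ++x) {
            unsigned sumR = 0, sumG = 0, sumB = 0;
            for (int dy = 0; dy < factor; ++dy) {
                const std::size_t rowStart =
                    static_cast<std::size_t>(y * factor + dy) * srcW;
                for (int dx = 0; dx < factor; ++dx) {
                    const RgbPixel& p = src[rowStart + x * factor + dx];
                    sumR += p.r;
                    sumG += p.g;
                    sumB += p.b;
                }
            }
            RgbPixel& out = dst[static_cast<std::size_t>(y) * dstW + x];
            out.r = static_cast<std::uint8_t>((sumR + area / 2) / area);
            out.g = static_cast<std::uint8_t>((sumG + area / 2) / area);
            out.b = static_cast<std::uint8_t>((sumB + area / 2) / area);
        }
    }
    return dst;
}
```

The extra cost compared with resizing the NV21 data first is that every full-resolution pixel goes through the YUV-to-RGB conversion before most of them are averaged away.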

Your other option is to do all this on the GPU using shaders. GPUs are way faster at this sort of thing, but they are finicky. You might need a CPU fallback anyway (in case the GPU isn't available for whatever reason; I'm not familiar with Android).

Asik
  • Well, yeah, I think I need to cross off the shader approach. It is fast, but it's a long way from C++ dlib (at least with the method that I know), and moving data `java->gpu->java->jni` is too complex, I guess. I'm not so familiar with shader languages. – Hohenheim Sep 11 '17 at 09:50