bytes per pixel, bytes per line - How to use function nativeSetImageBytes in tessbaseapi.cpp of tess-two?

Question

We are parsing an image of a text snippet with a resolution of 2121x105 px. In Java we have the following code to get a byte array (one of our constraints is to work with a byte array here):

import org.apache.commons.io.IOUtils;

...

InputStream is = getAssets().open("images/text.png");
byte[] bytes = IOUtils.toByteArray(is);

This byte array is then passed to the native C++ code. We are not using the Java wrapper of tess-two, but we do use its native libraries. In the native code we are trying to get the text of the image with GetUTF8Text(). We then saw that tess-two already has an implementation for setting the image to read from by passing it as a byte array:

void Java_com_..._TessBaseAPI_nativeSetImageBytes(JNIEnv *env,
                                                  jobject thiz,
                                                  jlong mNativeData,
                                                  jbyteArray data,
                                                  jint width,
                                                  jint height,
                                                  jint bpp,
                                                  jint bpl) {

...

We figured that bpp for a PNG should be 4 (RGBA). It's not clear, though, what is expected for bpl. If we set it to the width of the image multiplied by bpp, we get a segmentation fault. If we set it to zero, an empty string is returned.

UPDATE: The segmentation fault is thrown in GetUTF8Text() and not in SetImage().

SIGSEGV (signal SIGSEGV: invalid address (fault address: 0xc))

I mean what I said. Your image is encoded with libpng and needs decoding before tesseract can use it. See java example [here](https://stackoverflow.com/questions/6444869/how-do-i-read-pixels-from-a-png-file) — Dmitrii Z., Jul 30 '18 at 13:38
Okay, thank you. I get now that I need to decode my jbyteArray, but then I will get another object type while the nativeSetImageBytes expects a byte array. To make it clear the function nativeSetImageBytes is an implementation of tess-two. — Alexander Belokon, Jul 30 '18 at 15:01
tess-two (i.e. Tesseract) expects a decoded image in RGBA, RGB, or grayscale format. So you need to decode your PNG and convert the result to a byte array. bpp is bytes per pixel: for RGBA it would be 4 (byte 1 is red, 2 is green, 3 is blue, 4 is alpha), for RGB it would be 3 (byte 1 is red, 2 is green, 3 is blue), and for grayscale it would be 1. bpl is bytes per line = bpp * image width — Dmitrii Z., Jul 30 '18 at 15:38
Would you like to write an answer to my question, so I can assign it as solved? — Alexander Belokon, Aug 01 '18 at 10:29

score 1 · Accepted Answer · answered Aug 01 '18 at 10:31

tess-two, which uses Tesseract OCR, expects a decoded image in RGBA, RGB, or grayscale format.

So you need to decode your PNG (this question explains how to do it in Java) and convert the result to a byte array.

bpp is bytes per pixel: for RGBA it would be 4 (byte 1 is red, 2 is green, 3 is blue, 4 is alpha), for RGB it would be 3 (byte 1 is red, 2 is green, 3 is blue), and for grayscale it would be 1.

bpl is bytes per line = bpp * image width.
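As a rough illustration of the conversion step, here is a minimal Java sketch that packs already-decoded pixels into a tightly packed RGBA byte array and computes bpl. The class and method names (RgbaPacker, argbToRgba, bytesPerLine) are hypothetical and not part of tess-two. It assumes the pixels arrive as 0xAARRGGBB ints in row-major order, which is what Android's Bitmap.getPixels() yields after decoding with BitmapFactory, and that rows carry no padding.

```java
public final class RgbaPacker {
    /** Bytes per pixel for a tightly packed RGBA buffer. */
    public static final int BPP_RGBA = 4;

    private RgbaPacker() {}

    /** Bytes per line, assuming rows have no padding. */
    public static int bytesPerLine(int width, int bpp) {
        return width * bpp;
    }

    /**
     * Converts decoded pixels (0xAARRGGBB ints, row-major) into a byte
     * array laid out as R, G, B, A per pixel.
     */
    public static byte[] argbToRgba(int[] argb, int width, int height) {
        if (argb.length != width * height) {
            throw new IllegalArgumentException(
                "expected " + (width * height) + " pixels, got " + argb.length);
        }
        byte[] out = new byte[width * height * BPP_RGBA];
        int o = 0;
        for (int p : argb) {
            out[o++] = (byte) (p >> 16);  // red
            out[o++] = (byte) (p >> 8);   // green
            out[o++] = (byte) p;          // blue
            out[o++] = (byte) (p >>> 24); // alpha
        }
        return out;
    }
}
```

For the 2121x105 image from the question, this gives bpp = 4, bpl = 2121 * 4 = 8484, and a buffer of 8484 * 105 = 890820 bytes. The raw PNG file bytes are compressed and will generally be much shorter than that, which is consistent with the native code reading past the end of the buffer.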
