My team and I read PDF documents to autofill forms. To improve extraction quality and allow image editing, we use OpenCV to remove table formatting, in addition to image clipping. The project works normally in the local environment, but when we ran it on the test server we got the following error from OpenCV:

Could not initialize class nu.pattern.OpenCV$LocalLoader$Holder
    at nu.pattern.OpenCV$LocalLoader.getInstance(OpenCV.java:340)

Using Dependency Walker 2.2, I verified that three OpenCV dependencies are missing in the test environment: mf.dll, mfplat.dll and mfreadwrite.dll. I tried adding them manually, without success.
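
For anyone who wants to repeat this check without Dependency Walker, here is a minimal Java sketch. The class name `MediaFoundationCheck` and the helper `findMissing` are made up for illustration, and the sketch assumes the system DLLs are expected under `%SystemRoot%\System32`. It only tests whether the files exist, not whether they can actually be loaded.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MediaFoundationCheck {

    // The three DLLs that Dependency Walker reported as missing.
    static final List<String> REQUIRED_DLLS =
            Arrays.asList("mf.dll", "mfplat.dll", "mfreadwrite.dll");

    /** Returns the names from 'required' that are not regular files inside 'dir'. */
    static List<String> findMissing(Path dir, List<String> required) {
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            if (!Files.isRegularFile(dir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: system DLLs live under %SystemRoot%\System32.
        String systemRoot = System.getenv("SystemRoot");
        if (systemRoot == null) {
            System.out.println("SystemRoot is not set; this does not look like Windows.");
            return;
        }
        Path system32 = Paths.get(systemRoot, "System32");
        System.out.println("Missing from " + system32 + ": "
                + findMissing(system32, REQUIRED_DLLS));
    }
}
```

Running it on both the local machine and the test server should show whether the two environments differ in exactly these files.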

According to the answer https://stackoverflow.com/a/1834905/53897, I should create a megajar or uberjar from the library, but I have no idea how to do that. Unlike the case in that question, I don't need to build a jar from my project but from the library I'm trying to run in the test environment. My main question is: how can I load the library when the test environment doesn't have the DLLs it depends on?

Below are snippets of implementation and use of the library:

pom.xml:

<dependency>
    <groupId>org.openpnp</groupId>
    <artifactId>opencv</artifactId>
    <version>4.5.5-1</version>
</dependency>

The methods where I use the PDF reader and OpenCV:

 public String getTextopdfOCR(File file, List<String> searchMatches) throws Exception {
        PDDocument doc = PDDocument.load(file);
        PDFRenderer pdfRenderer = new PDFRenderer(doc);
        StringBuilder out = new StringBuilder();

        Tesseract tesseract = new Tesseract();
        tesseract.setDatapath(getCaminhoDiscoC() + "extraiTextoPDF");
        tesseract.setLanguage("por");
        tesseract.setOcrEngineMode(1);
        
        for (int page = 0; page < doc.getNumberOfPages(); page++) {
            BufferedImage bufferedImage = pdfRenderer.renderImageWithDPI(page, 180, ImageType.GRAY);
            
            File tempFile = File.createTempFile("imagem" + page, ".jpg"); //file.exists()
            ImageIO.write(bufferedImage, "jpg", tempFile);
            BufferedImage paginaBuffer = getLinesRemovedImage(tempFile, false);
            String result = tesseract.doOCR(paginaBuffer);
            out.append(result);
            tempFile.delete();

            // Checks whether any search terms were given and whether the text contains none of them
            if (searchMatches != null && !searchMatches.isEmpty()) {
                Optional<String> findFirst = searchMatches.stream().filter((termo) -> result.contains(termo)).findFirst();
                if (!findFirst.isPresent() && page > 0) {       // If nothing is found on the first page, continue
                    break;
                }
            }
        }

        doc.close();
        return out.toString().toLowerCase().replaceAll(AplicacaoBean.CARACTERES_ESPECIAIS_REGEX, "")
                .replaceAll("\\s", " ");
    }

public BufferedImage getLinesRemovedImage(File imageFile, Boolean isGray) throws Exception {
        try {
            nu.pattern.OpenCV.loadLocally();
            Mat srcMat = Imgcodecs.imread(imageFile.getPath());
            Mat originalMat = new Mat();
            if (!isGray) {
                originalMat = srcMat.clone();
            }
            Mat gray = new Mat();
            Imgproc.cvtColor(srcMat, gray, Imgproc.COLOR_BGR2GRAY);

            Imgproc.threshold(gray, gray, 0, 255, Imgproc.THRESH_BINARY_INV + Imgproc.THRESH_OTSU);

            Mat horizontal_kernel = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(40, 1));
            Mat remove_horizontal = new Mat();
            for (int i = 0; i < 2; i++) {
                Imgproc.morphologyEx(gray, remove_horizontal, Imgproc.MORPH_OPEN, horizontal_kernel);
            }

            List<MatOfPoint> contours = new ArrayList<>();
            Imgproc.findContours(remove_horizontal, contours, new Mat(), Imgproc.RETR_EXTERNAL,
                    Imgproc.CHAIN_APPROX_SIMPLE);

            Scalar color = new Scalar(255, 255, 255);

            Imgproc.drawContours(originalMat, contours, -1, color, 2);


            Mat vertical_kernel = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(1, 40));
            Mat remove_vertical = new Mat();
            for (int i = 0; i < 2; i++) {
                Imgproc.morphologyEx(gray, remove_vertical, Imgproc.MORPH_OPEN, vertical_kernel);
            }
            List<MatOfPoint> vertical_contours = new ArrayList<>();
            Imgproc.findContours(remove_vertical, vertical_contours, new Mat(), Imgproc.RETR_EXTERNAL,
                    Imgproc.CHAIN_APPROX_SIMPLE);

            Imgproc.drawContours(originalMat, vertical_contours, -1, color, 2);

            BufferedImage bufferedImage;

            bufferedImage = Mat2BufferedImage(originalMat);
            return bufferedImage;

        } catch (IOException e1) {
            e1.printStackTrace();
        }
        return null;
    }

    private BufferedImage Mat2BufferedImage(Mat matrix) throws Exception {
        MatOfByte mob = new MatOfByte();
        Imgcodecs.imencode(".jpg", matrix, mob);

        byte ba[] = mob.toArray();
        BufferedImage bi = ImageIO.read(new ByteArrayInputStream(ba));
        return bi;
    }
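
A note on the error itself: "Could not initialize class" is the message of a `NoClassDefFoundError` that the JVM raises when a class's static initializer has already failed once, so the real cause (here, most likely a native library that could not be linked) is only attached to the first failure. Since `getLinesRemovedImage` calls the loader for every page, later calls hide it. Below is a minimal sketch of a guard that runs the loader once and reports the root cause. The class `NativeLoadGuard` and its methods are hypothetical names, and the loader is passed in as a `Runnable` so the sketch does not depend on OpenCV being on the classpath.

```java
public final class NativeLoadGuard {

    // True once the loader has completed without throwing.
    private static boolean loaded;

    private NativeLoadGuard() {
    }

    /**
     * Runs 'loader' until it succeeds once; later calls are no-ops.
     * On failure, rethrows with the innermost cause attached.
     */
    public static synchronized void ensureLoaded(Runnable loader) {
        if (loaded) {
            return;
        }
        try {
            loader.run();
            loaded = true;
        } catch (Throwable t) {
            // Linkage failures are Errors, not Exceptions, so Throwable is caught here.
            Throwable root = rootCause(t);
            throw new IllegalStateException("Native library failed to load: " + root, root);
        }
    }

    /** Follows getCause() down to the innermost throwable. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }
}
```

In the code above, the call `nu.pattern.OpenCV.loadLocally();` could then become `NativeLoadGuard.ensureLoaded(nu.pattern.OpenCV::loadLocally);`, and the first exception in the server log should name the underlying problem.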

Environment settings:

apache-tomcat-9.0.33
JDK 15
Netbeans 15
Hérick Raposo
  • `mf.dll`, `mfplat.dll` and `mfreadwrite.dll` -- those are all from the Windows Media Foundation kit – berak Sep 30 '22 at 13:00
  • Thanks. My Windows Server version is 2012 R2 – Hérick Raposo Sep 30 '22 at 13:11
  • so you're looking for a way to reinstall those on your server – berak Sep 30 '22 at 13:23
  • another idea: do you actually NEED the MSMF-based VideoCapture? Did you *build* your own Java OpenCV SDK? You could disable MSMF support by adding `WITH_MSMF=OFF` in CMake, and hopefully those deps are gone – berak Sep 30 '22 at 13:25
  • Your first comment already helped me and it worked. After your answer I googled "Install Media Foundation on Windows Server 2012, 2012 R2" and clicked on the first video, which showed how to install the feature. After installation everything went fine. – Hérick Raposo Sep 30 '22 at 14:00
  • fine, good luck then ;) – berak Sep 30 '22 at 14:06
