I am trying to translate an image using OCR, but the watermark is in the way. Is there any way to remove the orange watermark from the picture, or at least make it lighter? Also, is it possible to do it in bulk (for all images in a folder)?

enter image description here

Here is an example of what it should look like after the watermark is removed: enter image description here

Sneaky Polar Bear
mp96

1 Answer

You could probably just threshold, but thresholding strips the smoothing (anti-aliasing) from small, low-pixel-count text, which can damage the subsequent OCR pretty badly. So instead I created a mask that removes the watermark and then applied it to the original image (this carries over the grey text boundary as well). Another trick that helped was to use the red channel, since the watermark is most saturated in red (~245). Note that this requires OpenCV and C++17.

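#include <filesystem>
#include <string>
#include <vector>
#include <opencv2/opencv.hpp>

namespace fs = std::filesystem;

using namespace cv;

int main()
{
    // true: show the intermediate images and wait for a key press.
    // false: overwrite each input file with the cleaned result.
    bool debugFlag = true;
    std::string path = "C:/Local Software/voyDICOM/resources/images/wmTesting/";
    for (const auto& entry : fs::directory_iterator(path))
    {
        std::string fileName = entry.path().string();
        Mat original = imread(fileName, cv::IMREAD_COLOR);
        if (original.empty()) { continue; } // skip anything imread cannot decode
        if (debugFlag) { imshow("original", original); }

        // Invert so that dark text becomes bright and the white background becomes 0.
        Mat inverted;
        bitwise_not(original, inverted);
        std::vector<Mat> channels;
        split(inverted, channels); // OpenCV stores BGR, so channels[2] is red

        for (int i = 0; i < 3; i++)
        {
            if (debugFlag) { imshow("chan" + std::to_string(i), channels[i]); }
        }

        // The watermark is bright in red (~245), i.e. dark in the inverted red
        // channel, so it falls below the threshold and is excluded from the mask.
        Mat bwImg;
        cv::threshold(channels[2], bwImg, 50, 255, cv::THRESH_BINARY);
        if (debugFlag) { imshow("thresh", bwImg); }

        // Copy only the masked pixels; everything else stays 0 (white once re-inverted).
        Mat outputImg;
        inverted.copyTo(outputImg, bwImg);

        bitwise_not(outputImg, outputImg);
        if (debugFlag) { imshow("output", outputImg); }

        if (debugFlag) { waitKey(0); }
        else { imwrite(fileName, outputImg); } // note: overwrites the original file
    }
    return 0;
}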

Image showing the benefit of masking over just thresholding: enter image description here
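
To make the difference concrete without OpenCV, here is a minimal sketch of the same per-pixel logic on a single 8-bit channel. The function names `thresholdOnly` and `maskOriginal` are made up for this illustration, and "just thresholding" is assumed to mean turning every pixel pure black or pure white; the cutoff of 50 matches the value used in the program above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Plain thresholding (assumed meaning): every pixel becomes pure black or
// pure white, so grey anti-aliased edge pixels lose their intermediate values.
std::vector<std::uint8_t> thresholdOnly(const std::vector<std::uint8_t>& row, int cutoff)
{
    assert(cutoff >= 0 && cutoff <= 255);
    std::vector<std::uint8_t> out;
    out.reserve(row.size());
    for (std::uint8_t v : row)
    {
        const int inverted = 255 - v; // dark text -> high value
        out.push_back(static_cast<std::uint8_t>(inverted > cutoff ? 0 : 255));
    }
    return out;
}

// Masking: the threshold only decides which pixels are kept. Kept pixels
// retain their original value; rejected pixels become white.
std::vector<std::uint8_t> maskOriginal(const std::vector<std::uint8_t>& row, int cutoff)
{
    assert(cutoff >= 0 && cutoff <= 255);
    std::vector<std::uint8_t> out;
    out.reserve(row.size());
    for (std::uint8_t v : row)
    {
        const int inverted = 255 - v;
        out.push_back(static_cast<std::uint8_t>(inverted > cutoff ? v : 255));
    }
    return out;
}
```

For a row of values {255, 245, 180, 90, 0} (background, watermark-like, two grey edge pixels, solid text) and a cutoff of 50, `thresholdOnly` gives {255, 255, 0, 0, 0}, while `maskOriginal` gives {255, 255, 180, 90, 0}: the watermark-like pixel is removed in both cases, but only masking keeps the grey edge values.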

ref: How can I get the list of files in a directory using C or C++?


Edit (added debugFlag to aid in debugging), debug output sample: enter image description here

Sneaky Polar Bear
  • I ran your code in Visual Studio (debug, x64) and I get a black background with only the watermark. I have attached an image and the debug log. Thank you. – mp96 Sep 21 '21 at 23:28
  • I added a debugFlag and some image outputs to try and help you with your bug. Unfortunately, I was unable to reproduce anything similar to the negative image you showed. Could you run this debug code and take a screenshot with all the panels, similar to how I have them set up at the bottom of this post? – Sneaky Polar Bear Sep 22 '21 at 01:29
  • It all works now, thanks for the help. Is there any way to save all the output images to a new folder, so that all the watermark-free pictures end up there? – mp96 Sep 22 '21 at 22:14
  • I have also added a new image to the main post. Is there any way to mask it similarly and remove the watermark? In this new image the watermark is similar text, but a bit thinner, on top of the image. Another thing I found is that the watermark is only removed from the top part of the image; the rest of the image (the bottom part) is clipped/removed. Is there any way to do it for the whole image, not just the top? – mp96 Sep 22 '21 at 22:37
  • I apologize, but the scope of the question seems to be shifting at this point. I am glad that you were able to get the code working, but I think that at this point you should probably close this question (accept an answer) and create a new question or questions if you have additional issues. – Sneaky Polar Bear Sep 23 '21 at 00:47
  • To answer your question, yes, just create a new folder and change the path and name of the files in the imwrite call accordingly. – Sneaky Polar Bear Sep 23 '21 at 00:48
  • Hmm, that last bit (that it is only operating on the tops of images) is interesting. To be honest, I tested this code only on images much shorter than the one you originally posted. I will see if I can reproduce this issue and get back to you. – Sneaky Polar Bear Sep 23 '21 at 00:50
  • I was unable to reproduce your clipping/partial processing of larger images. What are the dimensions of the image that you are seeing this effect on? (px or file size is fine) – Sneaky Polar Bear Sep 23 '21 at 17:58
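
On the follow-up about saving the results to a separate folder: a minimal sketch of how the output path could be built with std::filesystem, so that the originals are not overwritten. The helper name `makeOutputPath` and the variable `outputDir` are assumptions made for this illustration and are not part of the program in the answer.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: keeps the original file name but places it in outputDir.
fs::path makeOutputPath(const fs::path& inputFile, const fs::path& outputDir)
{
    assert(inputFile.has_filename());
    return outputDir / inputFile.filename();
}

// Possible use inside the directory loop (outputDir is an assumed variable):
//   fs::create_directories(outputDir);  // create the folder if it is missing
//   imwrite(makeOutputPath(entry.path(), outputDir).string(), outputImg);
```

The output folder should not be the same folder, or a folder inside the one, that is being iterated in a way that feeds cleaned images back into the loop.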