6

I want to pass an image object from my C# project to my Python script. However, as far as I understand, whatever is in the arguments is treated as a string: when I call type(passedImage) in Python, it is reported as a string, even if I pass a number instead of the image variable.

ProcessStartInfo start = new ProcessStartInfo();
start.FileName = @"C:\Python\Python36\python.exe";
start.Arguments = string.Format("{0} {1}", @"C:\OCRonImage2.py", image);
start.UseShellExecute = false;
start.RedirectStandardOutput = true;
start.CreateNoWindow = true;
using (Process process = Process.Start(start))
{

}
cylegend
  • Pass either the path to the image or, for example, the image as a base64 string; obviously you should modify the Python script to accept those. A third option: let the Python script read the image bytes from standard input, and write the image bytes there using Process.StandardInput – Selvin Dec 13 '19 at 16:25
  • I prefer not to pass the path to the image as I am creating the image and I don't want to save it somewhere so this will be my last solution if I am not able to solve this. I will search into the base64 string you suggested – cylegend Dec 13 '19 at 16:27
  • The third option is easy: in Python, read the image from `sys.stdin.buffer` (the same way as you would from a file opened with `open()`), and on the C# side use `process.StandardInput.BaseStream.Write(imageBytes, 0, imageBytes.Length)` – Selvin Dec 13 '19 at 16:38
  • Thank you guys, I managed to convert the image to a base64 string; however, now I get an error saying "System.ComponentModel.Win32Exception: 'The filename or extension is too long'" because the resulting string is too long. Is there any way around this? – cylegend Dec 13 '19 at 16:47
  • Then go for option 3 ... – Selvin Dec 13 '19 at 16:48
  • I still get the error "System.ComponentModel.Win32Exception: 'The filename or extension is too long'" – cylegend Dec 16 '19 at 11:39
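
The third option from the comments, streaming raw bytes over standard input, could look roughly like this on the Python side. This is a minimal sketch, not the asker's script: the helper names `read_image_bytes` and `as_file_object` are hypothetical, and the commented-out usage assumes Pillow and pytesseract are installed.

```python
import io
import sys


def read_image_bytes(stream=None):
    """Read every byte from a binary stream until EOF (defaults to stdin)."""
    if stream is None:
        # sys.stdin is a text stream; .buffer exposes the underlying bytes.
        stream = sys.stdin.buffer
    return stream.read()


def as_file_object(data):
    """Wrap raw bytes in a seekable, binary file-like object."""
    return io.BytesIO(data)


# Hypothetical usage inside the OCR script (needs Pillow and pytesseract):
#   from PIL import Image
#   import pytesseract
#   image = Image.open(as_file_object(read_image_bytes()))
#   print(pytesseract.image_to_string(image))
```

Note that `stream.read()` only returns once the writer closes its end, so the C# side has to close `process.StandardInput` after writing the bytes. Because nothing travels through the command line, the argument length limit behind "The filename or extension is too long" should not apply to this route.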

4 Answers

3

When you run OCRonImage2.py manually, do you pass an image file location as the argument? I would be surprised if you passed in a stream from the command line. It is no surprise that putting the entire image's bytes into an argument produces a string that is too long. Given the error you reported, I also suspect that the Python script expects a file path to the image. If you look at the Python code, you will probably find it using the file path argument to open the file, most likely with Image.open(filepath, mode='r'). The mode is optional; 'r' is the default.

You are in luck, however: Image.open also accepts a file-like object (a stream). If you are willing to modify the Python code, there are two options:

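  1. Convert the argument to a stream object. Since the argument is a string, decode it back to bytes (for example from base64) and wrap them in io.BytesIO(); Image.open needs a binary stream, so io.StringIO() will not work.
  2. Read from standard input (input() or sys.stdin) instead of using the argument; then you can redirect the input of the process and stream the image into your Python script.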
    // Requires: using System.Diagnostics; using System.IO;
    ProcessStartInfo start = new ProcessStartInfo();
    start.FileName = @"C:\Python\Python36\python.exe";
    start.Arguments = string.Format("{0}", @"C:\OCRonImage2.py");
    start.UseShellExecute = false;
    start.RedirectStandardOutput = true;
    start.RedirectStandardInput = true;  // lets us write to the script's stdin
    start.CreateNoWindow = true;
    using (Process process = Process.Start(start))
    {
        StreamWriter streamWriter = process.StandardInput;
        streamWriter.Write(imageString);  // imageString: the encoded image
        streamWriter.Close();             // signals end of input to Python
        // ...
    }

Make sure imageString is encoded in C# the same way it is decoded in the Python script.
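
If the image is sent as text, both sides have to agree on the encoding. Here is a minimal sketch of the Python side, assuming the C# program writes a single base64 string (for example, one produced by `Convert.ToBase64String`) to standard input. The helper name `decode_base64_image` is hypothetical, and the commented-out usage assumes Pillow is installed.

```python
import base64
import io


def decode_base64_image(text):
    """Turn a base64 string into a binary file-like object."""
    # strip() removes a trailing newline that a line-oriented writer may append.
    raw = base64.b64decode(text.strip())
    return io.BytesIO(raw)


# Hypothetical usage in the script:
#   import sys
#   from PIL import Image
#   image = Image.open(decode_base64_image(sys.stdin.read()))
```

The result is a binary stream, which is the kind of object Image.open can read from in place of a file path.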

Hopefully one of these solutions will work for you.

Larry Dukek
3

I work with the Anaconda distribution of Python. In my tests in an isolated conda environment, OCR with pytesseract through a Python script succeeds on a test image.

Prerequisites to test:

  • install Anaconda and create an env called py3.7.4: conda create --name py3.7.4
  • activate the env with conda activate py3.7.4
  • install pytesseract with conda install -c conda-forge pytesseract
  • create a folder called Test and place a jpg file called ocr.jpg in it (the sample image used here contains the text shown in the output below)
  • in the same Test folder also place a Python script called ocr_test.py with the following code:

    import pytesseract
    from PIL import Image
    import argparse
    
    parser = argparse.ArgumentParser(
        description='perform OCR on image')
    parser.add_argument("--path", "-p", help="path for image")
    args = parser.parse_args()
    print(pytesseract.image_to_string(Image.open(args.path)))
    print("done")
    

The above snippet accepts the image path as a command line argument. The --path flag must be specified in order to pass the image path as an arg.
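
Since the script cannot do anything useful without a path, the flag can also be marked as required so that argparse reports a missing path itself. This is a small sketch of that variation, not part of the tested script above: the `build_parser` helper is hypothetical, while `required=True` is a standard argparse option.

```python
import argparse


def build_parser():
    """Build the command-line parser for the OCR script."""
    parser = argparse.ArgumentParser(description="perform OCR on image")
    # required=True makes argparse exit with a usage error if --path is missing.
    parser.add_argument("--path", "-p", required=True, help="path for image")
    return parser


# parse_args accepts an explicit list, which is handy for trying the parser out.
args = build_parser().parse_args(["--path", "ocr.jpg"])
print(args.path)
```

In the real script, `parse_args()` would be called without arguments so that it reads the actual command line.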

Now, in the C# code snippet below, we will:

  • launch the cmd shell
  • navigate to the Test folder by setting the WorkingDirectory property of the ProcessStartInfo used to start the process.
  • activate Anaconda with the activate.bat file (replace the file path as per its location on your computer)
  • activate the above conda environment
  • call the Python script passing the imageFileName as an arg.

C# snippet:

using System.Diagnostics;
using System.Threading;


namespace PyTest
{
    class Program
    {

        static void Main(string[] args)
        {
            string workingDirectory = @"C:\Test";
            string imageFileName = "ocr.JPG";

            var process = new Process
            {                
                StartInfo = new ProcessStartInfo
                {
                    FileName = "cmd.exe",
                    RedirectStandardInput = true,
                    UseShellExecute = false,
                    RedirectStandardOutput = false,
                    WorkingDirectory = workingDirectory
                }

            };
            process.Start();


            using (var sw = process.StandardInput)
            {
                if (sw.BaseStream.CanWrite)
                {
                    // Vital to activate Anaconda
                    sw.WriteLine(@"C:\Users\xxxxxxx\Anaconda3\Scripts\activate.bat");
                    Thread.Sleep(500);
                    // Activate your environment
                    sw.WriteLine("conda activate py3.7.4");
                    Thread.Sleep(500);
                    sw.WriteLine($"python ocr_test.py --path {imageFileName}");
                    Thread.Sleep(50000);
                }                  

            }           
        }
    }
}

If you have followed the above steps, you should receive the following output on executing the C# snippet in Visual Studio:

Output:

Microsoft Windows [Version 10.0.18362.535]
(c) 2019 Microsoft Corporation. All rights reserved.

C:\xxxxxxx\Projects\Scripts>C:\Users\xxxxx\Anaconda3\Scripts\activate.bat

(base) C:\xxxxxx\Projects\Scripts>conda activate py3.7.4

(py3.7.4) C:\xxxxxxx\Projects\Scripts>python ocr_test.py --path ocr.JPG
Introduction

This is a test to see accuracy of Tesseract OCR
Test 1

Test 2
done

Note: I am unable to test with a standalone Python distro, but I believe it should work just fine with that too. The key is to pass the image file path as an argument to the Python script, so that the path passed from C# is treated as a path by Python as well. Also, Image.open() does the following (from the docs):

Opens and identifies the given image file. This is a lazy operation; this function identifies the file, but the file remains open and the actual image data is not read from the file until you try to process the data

amanb
2

I think it is a bad idea to pass the image as an argument.

Good options to go with:

CorrM
  • The stdin approach doesn't finish, since the base64 image string has 900k characters. I cannot use TCP because there will be no internet, and shared memory is meant for files, which I am trying to avoid creating. In the end I think I will just have to save the image to storage and then send the path to the saved image – cylegend Dec 17 '19 at 12:54
  • TCP doesn't need internet, just a local server and a local client. – CorrM Dec 17 '19 at 13:25
  • Given what you say, I think TCP is the best method for you. – CorrM Dec 17 '19 at 13:25
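
The local TCP idea from these comments can be sketched as follows. It is only an illustration under assumptions: the helper names are hypothetical, the payload is dummy data rather than a real image, and the `send_all` function stands in for the C# client, which would connect to the same loopback port.

```python
import socket
import threading


def receive_all(server_sock):
    """Accept one connection and read bytes until the sender closes it."""
    conn, _addr = server_sock.accept()
    chunks = []
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:  # an empty result means the peer closed the socket
                break
            chunks.append(chunk)
    return b"".join(chunks)


def send_all(port, data):
    """Stand-in for the C# client: connect to localhost and send the bytes."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(data)


# Bind to port 0 so the OS picks a free port on the loopback interface.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

payload = b"\x89PNG" + bytes(900_000)  # dummy data of roughly the size mentioned
sender = threading.Thread(target=send_all, args=(port, payload))
sender.start()
received = receive_all(server)
sender.join()
server.close()
```

Everything stays on 127.0.0.1, so no network connection is involved. In a real setup the two programs would also need to agree on the port, for instance by using a fixed one.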
2

You can save the image as a file somewhere on your local machine and give the python program the path to read it.

I think that's the easiest way to do it.

Edit: You can use a temporary file so that the file can be deleted afterwards.
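
The temporary-file idea can be illustrated with a short Python sketch. The helper name `write_temp_image` and the dummy bytes are hypothetical; in the setup from the question the C# side would do the equivalent (for example with `Path.GetTempFileName()`) and pass the resulting path to the script.

```python
import os
import tempfile


def write_temp_image(data, suffix=".png"):
    """Write bytes to a temporary file and return its path."""
    # delete=False keeps the file after closing so another process can open it.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as handle:
        handle.write(data)
        return handle.name


path = write_temp_image(b"\x89PNG dummy bytes")
try:
    with open(path, "rb") as image_file:
        content = image_file.read()  # the OCR script would open the path here
finally:
    os.remove(path)  # clean up once the consumer is done
```

Deleting the file in a `finally` block means it is removed even if reading it fails.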

Võ Quang Hòa