I am using pytesseract with tkinter. My current code is as follows:

def imagetostring():
    filepath = filedialog.askopenfilename(title = "Select File", filetypes = (("PNG", "*.png"),))
    output = pytesseract.image_to_string(Image.open(filepath))
    fp = filepath[:-4]
    filepathn = fp + ".txt"
    subprocess.run("echo.>" + filepathn + " && " + "echo " + output + "> " + filepathn,shell=True)
    print(output)

What I am trying to do is use pytesseract to convert the text in the image to a string, then create a file with the same name and a .txt extension that contains that text. However, the .txt file comes out empty. If I assign a hard-coded string to the output variable, the same process succeeds, but it does not work with the result of pytesseract.image_to_string. Yet when I print the output variable, the text does appear in the VS Code output terminal.

Akosoan
  • You know that you can use `with open(filename, "w") as file: file.write(data)` instead of using `subprocess`? `open` is a built-in function. – TheLizzard May 19 '21 at 14:16
  • I would assume that there are much better ways of retrieving the base name of a file than removing the last four characters to strip the extension. – Compo May 19 '21 at 14:18
  • @Compo Given that `filepath` can only end in `.png`, it isn't that big of a deal to just remove it using `[:-4]`. From OP's code we can see that the only filetypes allowed by the `openfilename` are `*.png` – TheLizzard May 19 '21 at 14:22
  • I never said that it was a big deal @TheLizzard, just that I'd assume there are better ways of doing it. _(and probably no need to define an intermediate variable `fp` in order to further define one named `filepathn`)_ – Compo May 19 '21 at 14:40
  • @TheLizzard @Compo absolutely, using `[:-4]` was just the quickest thing for me to write at that moment and I will surely clean it up and probably use `os.path.basename()`. Even the variable `fp` is just something I wrote to save time – Akosoan May 19 '21 at 14:53
  • @TheLizzard thanks so much, it worked immediately. However, I am also getting an arrow character at the end of the text. I believe I have to use `text.replace('\f','')` from [link](https://stackoverflow.com/questions/65880086/how-to-remove-the-arrow-sign-coming-after-text-extraction-using-ocr-pytesseract) but I'm not sure how exactly. – Akosoan May 19 '21 at 14:57
  • @Akosoan I have no idea how `pytesseract` works. But `data.replace("\f", "")` looks like a solution – TheLizzard May 19 '21 at 15:48
  • @TheLizzard Top Man. I'm going to really crack down and learn python properly to save the time of heroes like you. – Akosoan May 19 '21 at 19:39
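
Pulling the comments above together, here is a minimal sketch of writing the OCR text straight from Python instead of going through `echo`. The helper name `save_ocr_text` and the use of `pathlib` are assumptions made for illustration, not part of the original code. The likely reason the shell version fails is that the OCR output spans several lines, which the shell does not treat as a single `echo` argument, though that is a guess rather than something confirmed in the thread.

```python
from pathlib import Path


def save_ocr_text(image_path, text):
    """Write OCR text to a .txt file next to the image and return its path.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # with_suffix swaps the extension without slicing off a fixed number of characters.
    txt_path = Path(image_path).with_suffix(".txt")
    # Drop the form feed ("\f") that shows up as an arrow at the end of the text.
    cleaned = text.replace("\f", "")
    # Write directly from Python, so no shell has to parse the multi-line text.
    with open(txt_path, "w", encoding="utf-8") as file:
        file.write(cleaned)
    return txt_path
```

Inside `imagetostring()` this could be called as `save_ocr_text(filepath, output)` right after the `pytesseract.image_to_string` call, replacing the `subprocess.run` line.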
