Why is it that calling an executable via subprocess.call gives different results to subprocess.run?
The output of the call method is perfect - all new lines removed, formatting of the document is exactly right, '-' characters, bullets and tables are handled perfectly.
Running exactly the same command with the run method, however, and reading the output from stdout completely throws the output off. It is full of '\n', 'Â\xad', '\x97', '\x8f' characters, with spacing all over the place.
Here's the code I'm using:
subprocess.call:
result = subprocess.call(['/path_to_pdftotext', '-layout', '/path_to_file.pdf', '-'])
subprocess.run:
result = subprocess.run(['/path_to_pdftotext', '-layout', '/path_to_file.pdf', '-'], stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True, encoding='utf-8')
I don't understand why the run method doesn't parse and display the file in the same way. I'd use call, but I need to save the result of the pdftotext conversion to a variable (in the case of run: var = result.stdout).
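To make the comparison concrete, here is a minimal sketch of the two calls side by side. It does not use pdftotext: the child command is a hypothetical stand-in (a second Python process that prints two lines), and the names cmd, status and captured are only for illustration. It shows what each call hands back, and how the same captured string looks when viewed as a repr versus when printed.

```python
import subprocess
import sys

# Hypothetical stand-in for the pdftotext command: a child Python process
# that writes two lines of text to its stdout.
cmd = [sys.executable, "-c", "print('line one'); print('line two')"]

# call() captures nothing: the child writes straight to the terminal,
# and the return value is only the exit status.
status = subprocess.call(cmd)

# run() with stdout=PIPE captures the text into result.stdout instead.
result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                        encoding="utf-8")
captured = result.stdout

print(type(status).__name__, status)  # the exit status, not the text
print(repr(captured))                 # repr shows newlines as escape sequences
print(captured, end="")               # print renders them as line breaks
```

If the same pattern holds for the pdftotext case, some of the difference may come from how the captured string is being inspected rather than from the conversion itself, though that is an assumption this sketch cannot confirm.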
I could go through, identify every Unicode character that run isn't handling, and strip it out, but I figure there must be some encoding / decoding setting that the run method changes.
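As an illustration of the kind of encoding mismatch I have in mind, the sketch below shows one way a sequence like 'Â\xad' can appear: a soft hyphen (U+00AD) encoded as UTF-8 and then decoded as Latin-1. The sample text and variable names are made up for the example, and whether this is what actually happens to the pdftotext output is an assumption.

```python
# Hypothetical sample text containing a soft hyphen (U+00AD).
text = "co\u00adoperate"

# Bytes as a UTF-8 producer would emit them.
raw = text.encode("utf-8")

# Decoding with the matching codec round-trips cleanly.
as_utf8 = raw.decode("utf-8")

# Decoding the same bytes as Latin-1 turns the two-byte sequence
# into two separate characters.
as_latin1 = raw.decode("latin-1")

print(repr(raw))
print(repr(as_utf8))
print(repr(as_latin1))
```

Capturing the raw bytes (no encoding argument) and decoding them by hand like this would at least show which codec reproduces the clean text.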
EDIT
Having read a similarly worded question, I believe this one is different in scope, as I want to understand why the output is different.