I finally got my script to submit a PDF document to Google Cloud Storage and then extract the text using Google Vision for PDF, as described in the documentation.
The data is returned in a huge JSON file. There's one node that contains the text, but it's no longer formatted: only line breaks are marked, with \n. I don't care about the line breaks so much as the paragraphs.
How can I get the text back with its paragraph structure intact? Are there any libraries that work with GCP to post-process the JSON output?
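
To make the goal concrete, here is a rough sketch of the kind of post-processing I have in mind. It assumes the output JSON follows the nested `fullTextAnnotation` layout (pages, blocks, paragraphs, words, symbols) and that each symbol may carry a `detectedBreak` type; the function names `paragraph_text` and `extract_paragraphs` are my own and not part of any Google library. I have not checked it against every kind of PDF.

```python
# Break types the Vision API may attach to the last symbol of a word.
SPACE_BREAKS = {"SPACE", "SURE_SPACE"}
LINE_BREAKS = {"EOL_SURE_SPACE", "LINE_BREAK"}


def paragraph_text(paragraph):
    """Join the symbols of one paragraph into a single line of text."""
    parts = []
    for word in paragraph.get("words", []):
        for symbol in word.get("symbols", []):
            parts.append(symbol.get("text", ""))
            break_type = (
                symbol.get("property", {})
                .get("detectedBreak", {})
                .get("type")
            )
            # Line breaks inside a paragraph are flattened to spaces,
            # since only the paragraph boundaries matter here.
            if break_type in SPACE_BREAKS or break_type in LINE_BREAKS:
                parts.append(" ")
            # Assumption: a HYPHEN break means the word continues on the
            # next line, so nothing is appended for it.
    return "".join(parts).strip()


def extract_paragraphs(output):
    """Return a list of paragraph strings from one parsed output JSON file."""
    paragraphs = []
    for response in output.get("responses", []):
        annotation = response.get("fullTextAnnotation", {})
        for page in annotation.get("pages", []):
            for block in page.get("blocks", []):
                for paragraph in block.get("paragraphs", []):
                    text = paragraph_text(paragraph)
                    if text:
                        paragraphs.append(text)
    return paragraphs
```

With the output file loaded through `json.load`, something like `"\n\n".join(extract_paragraphs(data))` would then give one blank line between paragraphs, which is the formatting I am trying to get back.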