So I have come across a few posts that deal with converting PDF's to HTML or converting them to text, however they all deal with doing so from a file saved to the computer. Is there a way to extract the text from a webpage PDF without downloading the PDF file itself (as I will be doing so for a large number of files by iterating through a list of URL's)?
I am also curious which is the best library to achieve this with. pdfkit, pdf2txt, pdfminer, etc.?
Here is an example website with the format I will be dealing with: http://www.arkansasrazorbacks.com/wp-content/uploads/2017/02/Miami-Ohio-Game-2.pdf