I have a report in PDF format.

I want to read the information in it and convert it into a class or another object.

This is my code:

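import io

import qiniu
import requests
from PyPDF2 import PdfReader

def GetUrl(self):
    # Sign the file URL so it can be downloaded from the private bucket.
    auth = qiniu.Auth(self.__access_key, self.__secret_key)
    base_url = "84320875399.pdf"
    private_url = auth.private_download_url(base_url)
    return private_url

def DownloadPdf(self, url: str):
    response = requests.get(url)
    response.raise_for_status()  # stop early if the download failed
    return response.content

def TryParsePdf(self, data):
    # Wrap the raw bytes so PdfReader can treat them as a file.
    pdf_file = io.BytesIO(data)
    pdf_reader = PdfReader(pdf_file)
    found_text = False
    for page in pdf_reader.pages:
        text = page.extract_text(0)
        for row in text.split('\n'):
            print(row)
            found_text = True
    # Report whether any text was extracted, so the caller's check works.
    return found_text

def ParsePdf(self):
    url = self.GetUrl()
    data = self.DownloadPdf(url)
    if self.TryParsePdf(data):
        print("success")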
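
To illustrate the goal of turning the extracted rows into an object, here is a minimal sketch. It assumes each data row has the form `label: value`, which may not match the real report layout. `Report`, `rows_to_report`, and the sample text are hypothetical names and data made up for this example; they are not part of PyPDF2.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    # Hypothetical container; the real report's fields are not known here.
    fields: dict = field(default_factory=dict)
    unparsed_rows: list = field(default_factory=list)


def rows_to_report(text: str) -> Report:
    """Build a Report from text such as the string page.extract_text() returns."""
    report = Report()
    for row in text.split("\n"):
        row = row.strip()
        if not row:
            continue
        # Assumption: a data row looks like "label: value".
        label, sep, value = row.partition(":")
        if sep and label.strip():
            report.fields[label.strip()] = value.strip()
        else:
            # Keep rows that do not match, so nothing is silently dropped.
            report.unparsed_rows.append(row)
    return report


# Invented sample text standing in for one extracted page.
sample = "Name: example\nResult: pass\nfooter line"
report = rows_to_report(sample)
print(report.fields)
print(report.unparsed_rows)
```

Rows that do not fit the assumed pattern are collected in `unparsed_rows` instead of being discarded, which makes it easier to see where the real layout differs from the assumption.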

If PyPDF2 cannot do this, which library should I use instead?

Any help would be appreciated.

Martin Thoma

0 Answers