How to read SharePoint Online (Office 365) PDF and Word files using Python

Asked Jun 14 '23 at 17:01

Active Jun 14 '23 at 21:17

Viewed 101 times

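import io

import PyPDF2
from docx import Document  # provided by the python-docx package

# `file_path` must be a binary file-like object (something with .read()), not a path string
if extension == "pdf":
    # Read PDF file
    file_stream = io.BytesIO(file_path.read())
    pdf_reader = PyPDF2.PdfReader(file_stream)
    file_contents = ""
    for page in pdf_reader.pages:
        # Fall back to an empty string if a page yields no text
        file_contents += page.extract_text() or ""
elif extension == "docx":
    # Read Word file
    file_stream = io.BytesIO(file_path.read())
    doc = Document(file_stream)
    paragraphs = [p.text for p in doc.paragraphs]
    file_contents = "\n".join(paragraphs)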

How can we read PDF and Word file data from SharePoint?
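One possible way to get the file bytes out of SharePoint before parsing them is sketched below. It assumes the third-party Office365-REST-Python-Client package (imported as `office365`) and username/password sign-in, which may not work on tenants that enforce MFA. The helper names `file_extension`, `download_bytes`, and `extract_text` are hypothetical and only illustrate the flow; this is a sketch under those assumptions, not a confirmed solution.

```python
import io
import posixpath


def file_extension(server_relative_url: str) -> str:
    """Return the lower-cased extension without the dot, e.g. 'pdf'."""
    return posixpath.splitext(server_relative_url)[1].lstrip(".").lower()


def download_bytes(site_url: str, username: str, password: str,
                   server_relative_url: str) -> bytes:
    """Download one SharePoint file into memory.

    Assumption: Office365-REST-Python-Client is installed and the account
    can sign in with a plain username and password.
    """
    # Imported lazily so the other helpers work without the package installed.
    from office365.runtime.auth.user_credential import UserCredential
    from office365.sharepoint.client_context import ClientContext

    ctx = ClientContext(site_url).with_credentials(
        UserCredential(username, password)
    )
    buffer = io.BytesIO()
    # download() writes the file content into the given file-like object.
    ctx.web.get_file_by_server_relative_url(server_relative_url) \
        .download(buffer).execute_query()
    return buffer.getvalue()


def extract_text(data: bytes, extension: str) -> str:
    """Extract plain text from PDF or .docx bytes."""
    if extension == "pdf":
        import PyPDF2

        reader = PyPDF2.PdfReader(io.BytesIO(data))
        return "".join(page.extract_text() or "" for page in reader.pages)
    if extension == "docx":
        from docx import Document  # python-docx

        doc = Document(io.BytesIO(data))
        return "\n".join(p.text for p in doc.paragraphs)
    raise ValueError(f"Unsupported file type: {extension!r}")
```

With a placeholder site URL and a server-relative path such as `/sites/team/Shared Documents/report.pdf`, the calls would be chained as `extract_text(download_bytes(site_url, user, password, path), file_extension(path))`. Note that python-docx reads `.docx` only, so legacy `.doc` files would need a different tool.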

edited Jun 14 '23 at 21:17

Eugene Astafiev

asked Jun 14 '23 at 17:01

PramoD19

I am following the approach below, but I want to read data from PDF and Word files: https://stackoverflow.com/questions/69488725/read-sharepoint-excel-file-with-python-pandas – PramoD19 Jun 14 '23 at 17:17
@KJ In my case we are not editing anything; we just want to read the PDF and Word files if they are there. – PramoD19 Jun 16 '23 at 02:36

0 Answers