I am trying to iterate through all tables in a document and extract the text from them. As an intermediate step I am just trying to print the text to the console.
I have looked at code posted by scanny in similar questions, but for some reason it does not give the expected output for the document I am parsing.
The document can be found at https://www.ontario.ca/laws/regulation/140300
from docx import Document
from docx.enum.text import WD_COLOR_INDEX
import os, re, sys

document = Document("path/to/doc")
tables = document.tables

# Walk every table, row, cell and paragraph, printing the paragraph text
for table in tables:
    for row in table.rows:
        for cell in row.cells:
            for paragraph in cell.paragraphs:
                print(paragraph.text)
I expect this to print all the text in the tables, but instead I get nothing. If I try print(row.cells), it just prints (), which is an empty tuple. My document definitely does have text in the cells, though. I am not sure what is wrong here.
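One way to narrow this down is to bypass python-docx and read the table cells straight from the XML inside the .docx file. The sketch below uses only the standard library and rests on an unconfirmed assumption: that the cells are present in word/document.xml but wrapped in some element (for example a content control, w:sdt) that row.cells does not look inside. The helper name table_cell_texts is hypothetical and not part of python-docx.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by the elements in word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def table_cell_texts(docx_file):
    """Return the text of every table cell (w:tc) found anywhere in the body.

    docx_file may be a path or a binary file-like object.
    Hypothetical helper for diagnosis; not part of python-docx.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    texts = []
    # iter() searches all descendants, so cells are found even when they
    # are nested inside wrapper elements rather than directly under w:tr.
    for cell in root.iter(W + "tc"):
        paragraphs = []
        for paragraph in cell.iter(W + "p"):
            # Only w:t text runs are collected; tabs and line breaks are ignored.
            runs = (t.text or "" for t in paragraph.iter(W + "t"))
            paragraphs.append("".join(runs))
        texts.append("\n".join(paragraphs))
    return texts
```

Calling table_cell_texts("path/to/doc") returns a list with one string per cell. If that list contains the expected text while row.cells is still empty, the cells are probably nested in a way python-docx does not traverse. Note that a cell containing a nested table will also include the nested cells' text.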
Any help is appreciated.