I have a bunch of .docx documents that I am extracting text from using python-docx. The extraction itself works fine. However, I am having trouble getting hold of the list numbering for the paragraphs.
A Word document could look like this:
- Some Header: This is the first paragraph.
- Second Header: This is the second paragraph.
I am iterating through the paragraphs like this:
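from docx import Document

document = Document("example.docx")  # placeholder file name
for paragraph in document.paragraphs:
    print(paragraph.text)
    print(paragraph.style)
    print(paragraph._p.pPr.numPr.numId.val)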
It prints the paragraph style as
_ParagraphStyle('List Paragraph') id: 2280433126816
which is good. It also prints the text correctly.
However, it always prints 1 for
print(paragraph._p.pPr.numPr.numId.val)
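For context, `numId` in the WordprocessingML markup identifies the numbering definition (the list) a paragraph belongs to, not the item's position within that list, so every item of the same list reports the same value. Word computes the visible number when it renders the document. The following is a rough sketch, not a python-docx feature, of how the position could be approximated by counting paragraphs per `(numId, ilvl)` pair. The helper names `num_key` and `list_ordinals` are made up for illustration, and the sketch assumes numbering is set directly on the paragraph (not inherited from a style) and ignores custom start values and restarts defined in numbering.xml.

```python
from collections import defaultdict


def num_key(paragraph):
    """Return (numId, ilvl) for a directly numbered paragraph, else None."""
    pPr = paragraph._p.pPr
    if pPr is None or pPr.numPr is None or pPr.numPr.numId is None:
        return None
    num_id = pPr.numPr.numId.val
    if num_id == 0:  # numId 0 is used to switch numbering off
        return None
    # A missing ilvl element is treated as the top level (0) here.
    ilvl = pPr.numPr.ilvl.val if pPr.numPr.ilvl is not None else 0
    return num_id, ilvl


def list_ordinals(paragraphs):
    """Yield (ordinal, paragraph); ordinal is None for unnumbered paragraphs."""
    counters = defaultdict(int)
    for paragraph in paragraphs:
        key = num_key(paragraph)
        if key is None:
            yield None, paragraph
            continue
        num_id, ilvl = key
        counters[key] += 1
        # Restart the counters of deeper levels in the same list.
        for other in list(counters):
            if other[0] == num_id and other[1] > ilvl:
                del counters[other]
        yield counters[key], paragraph
```

With a document opened as above, `for n, p in list_ordinals(document.paragraphs): print(n, p.text)` would print a running count per list next to each paragraph's text. The counts are only an approximation of what Word displays, since the actual number format and start value live in the numbering part of the file.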