This is my first time posting here. I want to write a script that takes a .docx file as input and copies selected paragraphs (including tables and images), in their original order, into another template document (not appended at the end). The problem is that when I iterate over the elements, my code does not detect the images, so I cannot tell where an image sits relative to the text and tables, or which image it is. In short, doc1 contains: TEXT IMAGE TEXT TABLE TEXT
and what I end up with is: TEXT [IMAGE MISSING] TEXT TABLE TEXT
What I have so far:
- I can iterate over the paragraphs and tables:
    from docx.document import Document as _Document
    from docx.oxml.table import CT_Tbl
    from docx.oxml.text.paragraph import CT_P
    from docx.table import _Cell, Table
    from docx.text.paragraph import Paragraph

    def iter_block_items(parent):
        """
        Generate a reference to each paragraph and table child within *parent*,
        in document order. Each returned value is an instance of either Table or
        Paragraph. *parent* would most commonly be a reference to a main
        Document object, but also works for a _Cell object, which itself can
        contain paragraphs and tables.
        """
        if isinstance(parent, _Document):
            parent_elm = parent.element.body
            # print(parent_elm.xml)
        elif isinstance(parent, _Cell):
            parent_elm = parent._tc
        else:
            raise ValueError("something's not right")
        # Only direct children of the body (or cell) are visited
        for child in parent_elm.iterchildren():
            if isinstance(child, CT_P):
                yield Paragraph(child, parent)
            elif isinstance(child, CT_Tbl):
                yield Table(child, parent)
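To illustrate why the loop above never yields an image: in a .docx file an inline picture is not a sibling of the paragraphs and tables. It is nested inside a run of a paragraph (`w:p` > `w:r` > `w:drawing` > ... > `a:blip`), and the `r:embed` attribute of `a:blip` holds the relationship id of the image. The following is a minimal sketch using only the standard library, not python-docx; `classify_blocks` and `SAMPLE` are hypothetical names, and the sample XML is simplified (a real document also has `wp:inline` and `pic:pic` wrappers around the blip).

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {
    "w": W_NS,
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
}
R_EMBED = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}embed"
P_TAG = "{" + W_NS + "}p"
TBL_TAG = "{" + W_NS + "}tbl"


def classify_blocks(document_xml):
    """Return (kind, image_rids) for each paragraph/table in body order."""
    body = ET.fromstring(document_xml).find("w:body", NS)
    if body is None:
        return []
    blocks = []
    for child in body:
        if child.tag == P_TAG:
            # Pictures are descendants of the paragraph, so search inside it
            rids = [blip.get(R_EMBED) for blip in child.iterfind(".//a:blip", NS)]
            blocks.append(("paragraph", rids))
        elif child.tag == TBL_TAG:
            blocks.append(("table", []))
    return blocks


# Simplified stand-in for word/document.xml: TEXT, IMAGE, TABLE
SAMPLE = (
    '<w:document'
    ' xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"'
    ' xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main"'
    ' xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">'
    '<w:body>'
    '<w:p><w:r><w:t>TEXT</w:t></w:r></w:p>'
    '<w:p><w:r><w:drawing><a:graphic><a:graphicData>'
    '<a:blip r:embed="rId5"/>'
    '</a:graphicData></a:graphic></w:drawing></w:r></w:p>'
    '<w:tbl/>'
    '</w:body></w:document>'
)

print(classify_blocks(SAMPLE))
# [('paragraph', []), ('paragraph', ['rId5']), ('table', [])]
```

With python-docx, the equivalent check on a Paragraph object would presumably be `paragraph._p.xpath('.//a:blip/@r:embed')`, mirroring the xpath call already used on `_inline` in this post; that is an assumption, not something tested here.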
- I can get an ordered list of the images in a document:
    from docx.enum.shape import WD_INLINE_SHAPE

    # dwo is the source Document object
    pictures = []
    for pic in dwo.inline_shapes:
        if pic.type == WD_INLINE_SHAPE.PICTURE:
            pictures.append(pic)
- I can insert a specific image at the end of a paragraph:
    from io import BytesIO
    from docx.shared import Inches

    def insert_picture(index, paragraph):
        inline = pictures[index]._inline
        # Relationship id of the embedded image
        rId = inline.xpath('./a:graphic/a:graphicData/pic:pic/pic:blipFill/a:blip/@r:embed')[0]
        image_part = dwo.part.related_parts[rId]
        image_bytes = image_part.blob
        image_stream = BytesIO(image_bytes)
        # Append the picture in a new run, 6.5 inches wide
        paragraph.add_run().add_picture(image_stream, Inches(6.5))
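For reference, the lookup that `related_parts[rId]` performs can be pictured at the file level: a .docx is a zip archive, `word/_rels/document.xml.rels` maps each relationship id to a target file, and image targets usually sit under `word/media/`. Below is a rough standard-library sketch of that mapping. `image_bytes_by_rid` is a hypothetical helper, and it assumes internal targets given relative to the `word/` folder, which is the common case but not guaranteed.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

REL_TAG = "{http://schemas.openxmlformats.org/package/2006/relationships}Relationship"


def image_bytes_by_rid(docx_file):
    """Map each image relationship id in a .docx to the image's raw bytes."""
    images = {}
    with zipfile.ZipFile(docx_file) as zf:
        rels = ET.fromstring(zf.read("word/_rels/document.xml.rels"))
        for rel in rels.iter(REL_TAG):
            is_image = rel.get("Type", "").endswith("/image")
            is_external = rel.get("TargetMode") == "External"
            if is_image and not is_external:
                # Assumption: target is relative to word/, e.g. "media/image1.png"
                images[rel.get("Id")] = zf.read("word/" + rel.get("Target"))
    return images


# Build a tiny stand-in archive in memory to exercise the helper
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(
        "word/_rels/document.xml.rels",
        '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
        '<Relationship Id="rId5" Target="media/image1.png"'
        ' Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image"/>'
        '<Relationship Id="rId1" Target="styles.xml"'
        ' Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles"/>'
        '</Relationships>',
    )
    zf.writestr("word/media/image1.png", b"PNGDATA")
buf.seek(0)

print(image_bytes_by_rid(buf))
# {'rId5': b'PNGDATA'}
```

Knowing the relationship id found inside a paragraph is therefore enough to identify exactly which image belongs at that position.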
I use iter_block_items() like this:
    # insert_paragraph_after(), last_paragraph, paragraphs_with_table and
    # tables_to_apppend come from the rest of the script (not shown here)
    start_copy = False
    for block in iter_block_items(document):
        # Stop as soon as the end marker paragraph is reached
        if isinstance(block, Paragraph):
            if block.text == "TEXT FROM WHERE WE STOP COPYING":
                break
        if start_copy:
            if isinstance(block, Paragraph):
                last_paragraph = insert_paragraph_after(last_paragraph, block.text)
            elif isinstance(block, Table):
                paragraphs_with_table.append(last_paragraph)
                tables_to_apppend.append(block._tbl)
        # Start copying from the block after the start marker paragraph
        if isinstance(block, Paragraph):
            if block.text == "TEXT FROM WHERE WE START COPYING":
                start_copy = True