
I am trying to go through a Word document and find a few specific tables among many. I know how to iterate through all tables using either the docx library or win32, found here. However, I need to access only a few specific tables, not all of them.

These tables have headings in the format "Table A.x.x-x Insert table summary". The headings are text paragraphs above the tables, not part of the tables themselves. However, they don't show up when I use doc.ListParagraphs from win32, so I can't locate the tables that way.

I know the names of the tables I need to access. There is unrelated text throughout the document. The tables I need have nothing in common with each other, so I can't just look for a specific value in a certain cell or something like that.

Does anyone have suggestions on how to approach this? Preferably using win32 COM, but I'm open to any solutions.

2 Answers


I think the collection you're looking for is doc.Paragraphs.

doc.ListParagraphs only returns paragraphs that have list formatting, like bullets or numbers.

There are other challenges involved, but that's the first mystery solved I believe :)
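As a rough sketch of how doc.Paragraphs could be used for this (not tested against a real document here): find the paragraph whose text starts with the heading, then pick the table whose range begins closest after that paragraph ends. The function name find_table_after_heading and the example table number are made up for illustration; Paragraphs, Tables, Range.Text, Range.Start and Range.End are the standard Word object model members.

```python
def find_table_after_heading(doc, heading_prefix):
    """Return the first table that starts after the paragraph whose text
    begins with heading_prefix, or None if no such table exists.

    `doc` is assumed to be a Word Document COM object (or anything exposing
    the same Paragraphs / Tables / Range members).
    """
    for para in doc.Paragraphs:
        # Range.Text ends with a paragraph mark ("\r"), so strip before comparing.
        if para.Range.Text.strip().startswith(heading_prefix):
            heading_end = para.Range.End
            # Keep only tables located after the heading, then take the nearest one.
            candidates = [t for t in doc.Tables if t.Range.Start >= heading_end]
            return min(candidates, key=lambda t: t.Range.Start, default=None)
    return None


# Usage with Word (assumes pywin32 and Word are installed; the path and the
# table number are hypothetical):
#   import win32com.client
#   word = win32com.client.Dispatch("Word.Application")
#   doc = word.Documents.Open(r"C:\path\to\file.docx")
#   table = find_table_after_heading(doc, "Table A.1.2-3")
```

Matching with startswith rather than a substring search avoids picking up body text that merely refers to the table, such as "see Table A.1.2-3". It still assumes that no other paragraph earlier in the document starts with the same text.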

scanny

I figured out an answer, using this discussion. Thanks for the clarification on which win32 COM function to use!

From the discussion, I used the code for iter_block_items. I also made a list of the titles of the tables I wanted, called listOfTables. I then used the following code, which builds a dictionary whose keys are the table titles and whose values are the tables themselves.

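    import docx

    # path, listOfTables and iter_block_items are defined as described above.
    dox = docx.Document(path)
    count = False  # True once a wanted title has been seen and its table is pending
    tables = {}
    for item in iter_block_items(dox):
        try:
            text = item.text
            if text in listOfTables:
                title = text  # remember the matched title, not any later paragraph
                count = True
        except AttributeError:
            # Tables have no 'text' attribute, so item is a table here.
            if count:
                tables[title] = item
                count = False
    print(tables)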

When the loop reaches a table, it goes to the except case because a table has no attribute 'text'. Then, if count is true, i.e. if a preceding paragraph contained one of the wanted table titles, the title and the table itself are stored in the dictionary. This pairs the titles with the appropriate tables and gives me easy access to the table I need.
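One possible variant of the same pairing idea, written as a standalone function so it can be tried without a document: it ignores blank paragraphs between a title and its table, and forgets a pending title if unrelated text appears before any table does. The name pair_titles_with_tables is made up for illustration, and the function relies on the same observation as above, namely that table objects have no 'text' attribute.

```python
def pair_titles_with_tables(blocks, wanted_titles):
    """Map each wanted title to the first table that follows it.

    `blocks` is any iterable of paragraph-like objects (with a .text
    attribute) and table-like objects (without one), in document order.
    """
    tables = {}
    pending = None  # wanted title still waiting for its table
    for block in blocks:
        text = getattr(block, "text", None)
        if text is None:
            # No .text attribute: treat the block as a table.
            if pending is not None:
                tables[pending] = block
                pending = None
        elif text.strip() in wanted_titles:
            pending = text.strip()
        elif text.strip():
            # Unrelated non-empty paragraph: the pending title has no table directly below it.
            pending = None
    return tables
```

With the names used above, this would be called as pair_titles_with_tables(iter_block_items(dox), listOfTables).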