I'm new to coding with Python. Using code from other developers, I was able to read the tables in a .docx file, but what I want is to drill into the .docx, read only specific tables and rows, and write them to an Excel spreadsheet.

from docx import Document
import pandas as pd

document = Document("File Name")

# Collect the cell text of every row in every table.
rows = []
for table in document.tables:
    for row in table.rows:
        rows.append([cell.text for cell in row.cells])

# DataFrame.append was removed in pandas 2.0, so build the frame in one step.
df = pd.DataFrame(rows)
df.columns = ["Column1","Column2","Column3",etc...]
df.T.to_excel("File Name.xlsx")

I get all 22 tables in the Excel file, but I only need tables 3-9, with only 4 rows.

Timus

  • Thank you, Michael, for the quick reply. Could you also help with the first problem: extracting data from only 8 tables? – 3290_ali Mar 05 '23 at 19:59
  • document.tables is a list, so you can use Python slicing syntax, i.e. document.tables[3:9]. More details can be found here: https://stackoverflow.com/a/509295/1390927 – digby280 Mar 06 '23 at 16:54
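
The slicing idea from the comment above could look roughly like the sketch below. It rests on two assumptions that the question does not confirm: that "tables 3-9" is counted from 1 (so the slice is [2:9], whereas [3:9] would select the 4th through 9th tables), and that "only 4 rows" means the first four rows of each table. The helper names extract_rows and docx_tables_to_excel are made up for illustration.

```python
def extract_rows(tables, first_table=3, last_table=9, max_rows=4):
    """Return the cell text of the first `max_rows` rows of each selected table.

    `first_table` and `last_table` are 1-based and inclusive (an assumption
    about what "tables 3-9" means in the question).
    """
    # Python slices are 0-based and exclude the end index.
    selected = tables[first_table - 1:last_table]
    rows = []
    for table in selected:
        # list() keeps the slice independent of how the rows container is implemented.
        for row in list(table.rows)[:max_rows]:
            rows.append([cell.text for cell in row.cells])
    return rows


def docx_tables_to_excel(docx_path, xlsx_path):
    """Hypothetical wrapper: read the selected tables and write them to Excel."""
    # Imported here so extract_rows can be tried without these packages installed.
    from docx import Document
    import pandas as pd

    document = Document(docx_path)
    rows = extract_rows(document.tables)
    # Add .T before .to_excel to transpose the output, as the original code does.
    pd.DataFrame(rows).to_excel(xlsx_path, index=False)
```

Calling docx_tables_to_excel("File Name", "File Name.xlsx") would then write only the selected rows. If the four rows needed are not the first four, the row slice would have to change accordingly.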
