from docx import Document
import pandas as pd

# Read the Word document and collect the text of every table row.
document = Document("D:/tmp/test.docx")
rows = []

for table in document.tables:
    for row in table.rows:
        rows.append([cell.text for cell in row.cells])

# Build the DataFrame once; DataFrame.append was removed in pandas 2.0.
df = pd.DataFrame(rows, columns=["Column1", "Column2"])
df.to_excel("D:/tmp/test.xlsx")
print(df)

Output

  Column1 Column2
0   Hello    TEST
1     Est    Ting
2      Gg      ff

How can I remove the row index (0, 1, 2) and the column header from the output, and how can I add images with this code?

  • Does this answer your question? [How to remove index from a created Dataframe in Python?](https://stackoverflow.com/questions/37351172/how-to-remove-index-from-a-created-dataframe-in-python) – ThePyGuy Jun 07 '21 at 03:51
  • I don't see any column Index here, can you point out where the column index is? By the column index, do you mean the column names `Column1` and `Column2`? – ThePyGuy Jun 07 '21 at 04:10

2 Answers


You can drop the index and the header when exporting to Excel by passing these arguments:

df.to_excel("test.xlsx", header=False, index=False)
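
To see what those flags do without opening Excel, here is a minimal sketch. The sample rows are assumed stand-ins for the table text read from the Word file, and `to_string` is used only because it takes the same `index` and `header` flags for printing:

```python
import pandas as pd

# Assumed sample data standing in for the rows read from the .docx tables.
rows = [["Hello", "TEST"], ["Est", "Ting"], ["Gg", "ff"]]
df = pd.DataFrame(rows, columns=["Column1", "Column2"])

# The 0, 1, 2 in the printed output are the default index labels,
# not a real data column.
index_labels = list(df.index)

# Render the frame without the index, then without index and header.
without_index = df.to_string(index=False)
bare = df.to_string(index=False, header=False)

print(without_index)
print(bare)
```

The index is only dropped from the rendered or exported output; `df` itself still carries it.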
Aster Hu
  • @cha hey, could you please be more specific? I don't see any image mentioned. Do you mean adding borders in Excel? – Aster Hu Jun 07 '21 at 04:39
  • I think `XlsxWriter` can do that but unfortunately I'm not familiar with this. Sorry :( – Aster Hu Jun 07 '21 at 04:52

It can be done like this.

import pandas as pd

dataset = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})

sheet_name = 'Images'  # Sheet on which the image will be placed
cell = 'B2'  # Top-left cell where the image is anchored

# The context manager saves and closes the file on exit, so no explicit
# workbook.close() or writer.save() is needed (writer.save() was removed in pandas 2.0).
with pd.ExcelWriter('test.xlsx', engine='xlsxwriter') as writer:
    dataset.to_excel(writer, sheet_name='Data', index=False, header=False)

    workbook = writer.book
    worksheet = workbook.add_worksheet(sheet_name)
    worksheet.insert_image(cell, 'Tmp.jpg')  # Add the image
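
If the image should sit on the same sheet as the data rather than on a separate one, a variation along these lines may work. It is only a sketch: `image_anchor` and `write_with_image` are hypothetical helper names, and the one-column gap between the data and the image is an arbitrary choice.

```python
import pandas as pd


def image_anchor(df, gap=1):
    """Return the zero-based (row, col) just right of the data.

    Assumes the frame is written with index=False starting at cell A1.
    """
    return (0, len(df.columns) + gap)


def write_with_image(df, image_path, out_path, sheet_name="Data"):
    """Write df without index or header and place an image beside it."""
    with pd.ExcelWriter(out_path, engine="xlsxwriter") as writer:
        df.to_excel(writer, sheet_name=sheet_name, index=False, header=False)
        # writer.sheets maps sheet names to the XlsxWriter worksheet objects.
        worksheet = writer.sheets[sheet_name]
        row, col = image_anchor(df)
        worksheet.insert_image(row, col, image_path)
```

Calling `write_with_image(dataset, 'Tmp.jpg', 'test.xlsx')` would then put the image at D1 for a two-column frame, assuming `Tmp.jpg` exists next to the script.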

DumbCoder
  • I have never done it myself, but check out this [link](https://stackoverflow.com/questions/56428445/extract-images-from-word-document-using-python). I think it will solve your issue. – DumbCoder Jun 07 '21 at 12:50