
Image of the table I need to extract the highlighted text from.

I need to write a Python script that converts this docx table to CSV, writing only the highlighted value from each row. For example, the column name would be "overall verdict", so the value underneath it must be "4". If anyone can help, it would be appreciated.

  • Possible duplicate of [Extracting Highlighted Words from Word Document (.docx) in Python](https://stackoverflow.com/questions/9562671/extracting-highlighted-words-from-word-document-docx-in-python) – jinawee Apr 24 '19 at 10:19
  • But the document.xpath function doesn't work; I get an error saying document has no attribute named xpath. – Ashish Mittal Apr 25 '19 at 08:41
  • What docx version are you using? See https://stackoverflow.com/questions/23376105/search-and-replace-in-python-docx. You can search your docx installation to verify that xpath doesn't exist. – jinawee Apr 25 '19 at 09:22
  • Yeah, so that post says that xpath is no longer available in the docx module. So how am I going to do the highlighted text extraction? – Ashish Mittal Apr 26 '19 at 08:50
  • Don't know. You can download the older version. Or you could use a generic XML parser to read the unzipped docx; the format is described in the linked question. Or you could write a basic search yourself... – jinawee Apr 26 '19 at 12:18
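
The "generic XML parser" route from the last comment could look roughly like the sketch below, which uses only the standard library. It rests on assumptions that cannot be checked from the screenshot: that the first cell of each table row holds the label (e.g. "overall verdict"), that the chosen value sits in one of the remaining cells, and that it was marked with Word's text highlighter (a `w:highlight` run property) rather than cell shading (`w:shd`), which this code would not detect. The function names `extract_highlighted` and `write_csv` are made up for illustration.

```python
import csv
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def cell_text(cell):
    """All text in a table cell, highlighted or not."""
    return "".join(t.text or "" for t in cell.iter(f"{{{W}}}t")).strip()


def highlighted_text(cell):
    """Text of the runs in a cell that carry a <w:highlight> property."""
    parts = []
    for run in cell.iter(f"{{{W}}}r"):
        hl = run.find("w:rPr/w:highlight", NS)
        # w:val="none" means the highlight was explicitly switched off
        if hl is not None and hl.get(f"{{{W}}}val") != "none":
            parts.append("".join(t.text or "" for t in run.findall("w:t", NS)))
    return "".join(parts).strip()


def extract_highlighted(docx_path):
    """Map each row label to the highlighted text found in the rest of that row.

    Assumption: the first cell of a row is the label, the others are options.
    """
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    result = {}
    for row in root.iter(f"{{{W}}}tr"):
        cells = row.findall("w:tc", NS)
        if not cells:
            continue
        label = cell_text(cells[0])
        marked = [m for m in (highlighted_text(c) for c in cells[1:]) if m]
        if label and marked:
            result[label] = " ".join(marked)
    return result


def write_csv(data, csv_path):
    """Write labels as the header row and highlighted values as one data row."""
    with open(csv_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(list(data.keys()))
        writer.writerow(list(data.values()))
```

With placeholder file names, usage would be `write_csv(extract_highlighted("form.docx"), "form.csv")`. If the real table is laid out differently (labels in a header row, say), the row loop would need adjusting.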
