Here is my situation: I have a .doc file that looks like this:

Headline 1

headline 1.1
paragraph paragraph paragraph paragraph

headline 1.2
paragraph paragraph paragraph paragraph

Headline 2

headline 2.1
paragraph paragraph paragraph paragraph

headline 2.2
paragraph paragraph paragraph paragraph

I want to extract the whole "Headline 1" section, including its sub-headlines and paragraphs. How can I do this with Python?

Revenant
  • You should consider pasting the code you wrote when attempting to solve the task so that others can provide better help. Please refer to the help center on [how to ask a good question](https://stackoverflow.com/help/how-to-ask) – damores Feb 28 '18 at 04:01
  • Possible duplicate of [How to extract text from an existing docx file using python-docx](https://stackoverflow.com/questions/25228106/how-to-extract-text-from-an-existing-docx-file-using-python-docx) – Joe Feb 28 '18 at 06:44
  • Since docx files are basically zipped XML files, you can also read them using ElementTree, which ships with Python: http://etienned.github.io/posts/extract-text-from-word-docx-simply/ – Joe Feb 28 '18 at 06:47
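
The zipped-XML approach from the last comment could look roughly like the sketch below. It rests on several assumptions: the file has first been converted from .doc to .docx (the legacy .doc format is binary, not zipped XML), the headlines can be recognised by their exact text ("Headline 1", "Headline 2"), and the helper names `read_paragraphs` and `extract_section` are made up for illustration. Matching on paragraph style instead of text may be more robust for real documents.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def read_paragraphs(docx_path):
    """Return the plain text of every paragraph (w:p) in document order.

    Hypothetical helper: expects a .docx file, not a legacy .doc file.
    """
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(W + "p"):
        # A paragraph's text is split across one or more w:t elements
        paragraphs.append("".join(t.text or "" for t in p.iter(W + "t")))
    return paragraphs


def extract_section(paragraphs, start_heading, end_heading):
    """Collect non-empty paragraphs from start_heading up to, but not
    including, end_heading. Headings are matched by exact text, which is
    an assumption about how the document is written.
    """
    section = []
    inside = False
    for text in paragraphs:
        stripped = text.strip()
        if stripped == start_heading:
            inside = True
        elif inside and stripped == end_heading:
            break
        if inside and stripped:
            section.append(stripped)
    return section
```

With a converted file, usage might be `extract_section(read_paragraphs("example.docx"), "Headline 1", "Headline 2")`, where `example.docx` is a placeholder name. If the start headline is not found, the result is an empty list; if the end headline is not found, everything after the start headline is returned.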
