I am trying to extract the Heading 1 text from Word documents stored in a directory.
I am very new to Python, so my experience is limited.
My code below does not work; it has syntax and structural errors.
It fails with an error saying that document is not defined.
import os
from docx import Document

#document = Document('C:\\Users\\Work\\Desktop\\Docs')
mydir = "C:\\Users\\Work\\Desktop\\Docs\\"
for arch in os.listdir(mydir):
    archpath = os.path.join(mydir, arch)
    with open(archpath) as f:
        for paragraph in document.paragraphs:
            if paragraph.style.name == 'Heading 1':
                print(paragraph.text)
                document.save = Document('headings.docx')
I have searched both Stack Overflow and the wider internet, but I have not found anything that shows how to loop through the documents in a folder.
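For reference, the folder loop on its own can be sketched with only the standard library. This is a minimal sketch, not a definitive approach: the helper name `list_docx_files` is made up for illustration, and it assumes the documents are `.docx` files sitting directly in the folder (no subfolders).

```python
from pathlib import Path


def list_docx_files(folder):
    """Return the .docx files directly inside `folder`, sorted by name.

    `list_docx_files` is a hypothetical helper, not part of any library.
    """
    return sorted(
        path
        for path in Path(folder).glob("*.docx")
        # Word creates lock files such as "~$report.docx" while a document
        # is open; they are not real documents, so skip them.
        if path.is_file() and not path.name.startswith("~$")
    )


# Example call, using the folder from the code above:
# for path in list_docx_files(r"C:\Users\Work\Desktop\Docs"):
#     print(path.name)
```

Filtering on the `.docx` extension matters because `os.listdir` returns every entry in the folder, including files that python-docx cannot open.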
Is my code structured correctly? How can I loop through the documents in a directory and extract each Heading 1 into a new document?