The same question has been asked here and here, but I couldn't find a way to extract only the headings from a PDF file. Let's say a PDF file has been generated from a Word document that has structured headings with paragraphs written under them. What I'd like to do is extract every heading together with the paragraphs under it, in dictionary form (heading as the key, its paragraph text as the value).
Is there any way to achieve this in Python? If so, I would appreciate an initial guide. Thank you.
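
To make the desired output concrete, here is a minimal sketch of the grouping step only. It assumes the text has already been pulled out of the PDF as `(text, font_size)` pairs in reading order (for example with a library that exposes font sizes, such as pdfminer.six or PyMuPDF), and it assumes headings are set in a noticeably larger font than the body text. The function name `group_by_heading` and the `min_ratio` threshold are made up for illustration; a PDF exported from Word will not always satisfy the font-size assumption.

```python
from collections import Counter


def group_by_heading(spans, min_ratio=1.15):
    """Group text spans into {heading: paragraph_text}.

    spans: list of (text, font_size) tuples in reading order (assumed input).
    min_ratio: hypothetical threshold; a span counts as a heading when its
               font size is at least min_ratio times the body font size.
    """
    # Estimate the body font size as the size covering the most characters.
    chars_per_size = Counter()
    for text, size in spans:
        chars_per_size[round(size, 1)] += len(text)
    if not chars_per_size:
        return {}
    body_size = chars_per_size.most_common(1)[0][0]

    sections = {}
    current = None  # heading currently being filled
    for text, size in spans:
        text = text.strip()
        if not text:
            continue
        if size >= body_size * min_ratio:
            # Larger font: start (or resume) a section under this heading.
            current = text
            sections.setdefault(current, [])
        elif current is not None:
            # Body text: attach to the most recent heading.
            sections[current].append(text)
        # Body text before the first heading is dropped in this sketch.

    return {heading: " ".join(parts) for heading, parts in sections.items()}


if __name__ == "__main__":
    example = [
        ("Introduction", 16.0),
        ("This is the first paragraph.", 11.0),
        ("It continues here.", 11.0),
        ("Methods", 16.0),
        ("We did things.", 11.0),
    ]
    print(group_by_heading(example))
```

If the headings differ from the body by weight (bold) or font name rather than by size, the same loop would apply with a different heading test.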