I am currently developing a proprietary PDF parser that can read multiple types of documents with various types of data. Before starting, I was thinking about if reading PowerPoint slides was possible. My employer uses presentation guidelines that requires imagery and background designs - is it possible to build a parser that can read the data from these PowerPoint PDFs without the slide decor getting in the way?
So the workflow would basically be this:
- At the end of a project, the project report is delivered in the form of a presentation.
- The presentation would be converted to PDF.
- The PDF would be submitted to my application.
- The application would read the slides and create a data-focused report for quick review.
The goal of the application is to cut down on the amount of reading that needs to be done by a significant amount as some of these presentation reports can be many pages long with not enough time in the day.