
I am using XSLFPowerPointExtractor to extract text from a pptx file. However, all the text in the file is returned to me as a single string. Is there any way I can get the text of each slide separately? I am completely new to this, so please give detailed answers.

David Brossard

1 Answer


I looked up the API documentation, and with XSLFPowerPointExtractor it seems to be all or nothing. Its getText() method returns the text of all the slides as one string, which is exactly the behavior you are observing.

A bit more googling showed me that the way to do it is to use another API, namely XMLSlideShow. It gives you slide-by-slide access to the presentation.

From each slide, you can access its shapes, including the text shapes, and read the text from those. This is explained in another SO question, which I believe will help you resolve your issue: How to get pptx slide notes text using apache poi?
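
Here is a minimal sketch of the slide-by-slide approach. It assumes a reasonably recent Apache POI (the poi-ooxml artifact) on the classpath. The class name SlideTextExtractor and the method textPerSlide are names I made up for illustration; they are not part of POI. It only looks at top-level text shapes, so text inside tables or grouped shapes would need extra handling.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.xslf.usermodel.XMLSlideShow;
import org.apache.poi.xslf.usermodel.XSLFShape;
import org.apache.poi.xslf.usermodel.XSLFSlide;
import org.apache.poi.xslf.usermodel.XSLFTextShape;

// Hypothetical helper class (not part of POI): collects text slide by slide.
final class SlideTextExtractor {

    // Returns one string per slide, in slide order.
    // Text shapes on the same slide are joined with a newline.
    static List<String> textPerSlide(XMLSlideShow ppt) {
        List<String> result = new ArrayList<>();
        for (XSLFSlide slide : ppt.getSlides()) {
            List<String> parts = new ArrayList<>();
            for (XSLFShape shape : slide.getShapes()) {
                // Only text-bearing shapes (text boxes, titles, placeholders) expose getText().
                if (shape instanceof XSLFTextShape) {
                    parts.add(((XSLFTextShape) shape).getText());
                }
            }
            result.add(String.join("\n", parts));
        }
        return result;
    }

    // Usage: java SlideTextExtractor path/to/file.pptx
    public static void main(String[] args) throws IOException {
        try (InputStream in = new FileInputStream(args[0]);
             XMLSlideShow ppt = new XMLSlideShow(in)) {
            List<String> slides = textPerSlide(ppt);
            for (int i = 0; i < slides.size(); i++) {
                System.out.println("Slide " + (i + 1) + ":");
                System.out.println(slides.get(i));
            }
        }
    }
}
```

Returning a list with one entry per slide keeps the slide index available to the caller, which is the piece that getText() on the extractor throws away.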

David Brossard