I used this code, and the output was the text of the entire page, but I only want specific text. Are there any solutions?

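import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

public class ReadingText {
    public static void main(String[] args) throws IOException {
        File file = new File("C:/PdfBox_Examples/new.pdf");
        // try-with-resources closes the document even if extraction fails
        try (PDDocument document = PDDocument.load(file)) {
            PDFTextStripper pdfStripper = new PDFTextStripper();
            // getText returns the text of every page in the document
            String text = pdfStripper.getText(document);
            System.out.println(text);
        }
    }
}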
Prashant Gupta
  • What do you think `pdfStripper.getText(document);` means? What specific text are you looking for? Can you provide examples of what you are getting and of what you want? – Mr.Turtle Nov 19 '18 at 11:16
  • Isn't it obvious that you have to somehow "parse" the text string you already got by `getText` then? – UninformedUser Nov 19 '18 at 11:18
  • Please edit your question to clarify what "specific text" means. An area? A table element? A page? Text with specific font? Text in specific language? – Tilman Hausherr Nov 19 '18 at 11:25
  • Text from a specific area that is selected by the user – Bilal Zaveri Nov 20 '18 at 12:36
  • Then try the `ExtractTextByArea.java` example in the source code download or here: https://svn.apache.org/viewvc/pdfbox/branches/2.0/examples/src/main/java/org/apache/pdfbox/examples/util/ExtractTextByArea.java?view=markup – Tilman Hausherr Nov 20 '18 at 20:36
  • If the comment answered your question, then I think your question is a duplicate of https://stackoverflow.com/questions/28276893/reading-a-table-or-cell-value-in-a-pdf-file-using-java/28295244#28295244 So you should delete yours and upvote that one (if it helps). You can also answer this question yourself, but in that case you should improve it first as explained in the comments. – Tilman Hausherr Nov 22 '18 at 09:52
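
For readers who land here, the approach suggested in the comments can be sketched roughly as follows. This is a minimal sketch assuming PDFBox 2.0.x (the branch the linked example belongs to), not a copy of `ExtractTextByArea.java`. The class name `ReadingTextByArea`, the helper `extractRegionText`, the region name "selection", and the rectangle values are all placeholders chosen for illustration; in a real application the rectangle would come from whatever the user selected. As far as I know, the region is given in page units with the origin at the top-left corner and y increasing downwards, so coordinates taken from a viewer may need converting.

```java
import java.awt.geom.Rectangle2D;
import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.text.PDFTextStripperByArea;

public class ReadingTextByArea {

    // Hypothetical helper: returns the text found inside `region` on one page.
    static String extractRegionText(PDDocument document, int pageIndex,
                                    Rectangle2D region) throws IOException {
        PDFTextStripperByArea stripper = new PDFTextStripperByArea();
        stripper.setSortByPosition(true);
        // "selection" is an arbitrary label used to look the text up afterwards
        stripper.addRegion("selection", region);
        PDPage page = document.getPage(pageIndex); // pages are zero-based
        stripper.extractRegions(page);
        return stripper.getTextForRegion("selection");
    }

    public static void main(String[] args) throws IOException {
        File file = new File("C:/PdfBox_Examples/new.pdf");
        try (PDDocument document = PDDocument.load(file)) {
            // Placeholder rectangle (x, y, width, height); replace it with
            // the area the user actually selected.
            Rectangle2D region = new Rectangle2D.Double(10, 280, 275, 60);
            System.out.println(extractRegionText(document, 0, region));
        }
    }
}
```

Several regions can be registered on the same stripper with different names before calling `extractRegions`, which avoids processing the page once per selection. Note that `PDDocument.load` is the 2.0.x way to open a file; newer major versions of PDFBox open documents differently.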

0 Answers