3

I downloaded PDFBox 1.8.6 from Apache.

I then copied it into a 'res' folder I created at the root of my Java/Eclipse project. Next I right-clicked the project, went to Properties, then Java Build Path, then Libraries, clicked Add JARs... and added the JAR and its documentation.

Here are the results

From then on, in my code, I could import(ish) PDFBox.

For example, I can see:

import org.apache.pdfbox.pdmodel.*;

But, if you want to do something useful, you need to import more, often in the form of:

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.edit.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.font.PDFont;

For some odd reason, I don't see these...

Here is a screenshot of the imports...

Could anyone elucidate this for me, please?

Dawood ibn Kareem
SuperWoman

6 Answers

1

The current file linked at the official source is not the correct one. It weighs in at 28K instead of a few megabytes. Wow, for once it's not me! hahaha
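A quick way to catch this kind of bad download, as a rough sketch: a JAR is a ZIP archive, so a genuine one with at least one entry starts with the bytes `P`, `K`, 3, 4, while a saved web page starts with text. The class and method names below are made up for illustration and are not part of PDFBox or Eclipse.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only.
class JarDownloadCheck {

    // Returns true if the file starts with the ZIP local-file-header signature "PK\3\4".
    static boolean looksLikeJar(Path file) throws IOException {
        byte[] header = new byte[4];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // read() may return fewer bytes than requested, so loop until 4 bytes or end of file.
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) {
                    break;
                }
                total += n;
            }
        }
        return total == 4
                && header[0] == 'P' && header[1] == 'K'
                && header[2] == 3 && header[3] == 4;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java JarDownloadCheck <path-to-jar>");
            return;
        }
        Path file = Paths.get(args[0]);
        System.out.println(file + " looks like a JAR: " + looksLikeJar(file));
    }
}
```

Comparing the file size against the size listed on the download page is an even simpler check; the signature test just makes the "HTML page saved as .jar" case explicit.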

SuperWoman
  • Which file, which link? – Tilman Hausherr Sep 22 '14 at 06:28
  • Hello Tilman, http://pdfbox.apache.org/downloads.html#recent gives you the impression that you can right click and download, but you can't (http://www.apache.org/dyn/closer.cgi/pdfbox/1.8.7/pdfbox-1.8.7.jar) since it results in a file that is 27.3kb. You have to left click, follow the rabbit down the link to this other page and then you can download from there. – SuperWoman Sep 22 '14 at 14:52
  • I've forwarded your complaint to The Master and there's now a new link without this problem: http://pdfbox.apache.org/downloads.cgi – Tilman Hausherr Sep 23 '14 at 21:39
0

The relevant packages are all there in your second screenshot. You have to select the package first, from the list that's displayed; then press Ctrl-Space to be able to select the classes within each package. From the look of your second screenshot, it all seems to be working just right.

Dawood ibn Kareem
  • Hi David, looks can be deceiving... otherwise, I would be able to access the org.apache.pdfbox.pdmodel.PDDocument. But as shown in the screenshot, there's nothing showing up for 'p', hence this post. http://pdfbox.apache.org/docs/1.8.3/javadocs/org/apache/pdfbox/pdmodel/PDDocument.html Have you tried it? – SuperWoman Sep 22 '14 at 00:11
  • @SuperWoman Even I could not find the class file `PDFieldTreeNode`, so I used `2.0.0-SNAPSHOT` – Ankur Singhal Sep 23 '14 at 10:37
0

The problem is that you have imported both the javadoc jar and the actual jar as your build libraries.

I tried doing that right now, and I got exactly the same problem as you do.

Remove the javadoc from the build path. If you want to connect the javadoc to the pdfbox jar, you should click the triangle to the left of the real jar, select Javadoc location, click "Edit...", and then select "Javadoc in archive" and select your archive.

By the way, I may be wrong, but the pdfbox-app-1.8.6.jar seems to be for the command line apps. The one to be used for building your own projects is probably the lighter pdfbox-1.8.6.jar.
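To tell which jar on the build path actually carries the compiled classes (a javadoc archive holds HTML pages, not `.class` files), a small lookup like the following can help. This is only a sketch using the JDK's `java.util.jar.JarFile`; the class name `JarClassLookup` and its method are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarFile;

// Hypothetical helper for illustration only.
class JarClassLookup {

    // Returns true if the jar has a .class entry for the given fully qualified class name.
    static boolean containsClass(Path jar, String className) throws IOException {
        // org.apache.pdfbox.pdmodel.PDDocument -> org/apache/pdfbox/pdmodel/PDDocument.class
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getJarEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: java JarClassLookup <jar> <fully.qualified.ClassName>");
            return;
        }
        boolean found = containsClass(Paths.get(args[0]), args[1]);
        System.out.println(args[1] + (found ? " is" : " is NOT") + " in " + args[0]);
    }
}
```

Running it against each jar in the 'res' folder with `org.apache.pdfbox.pdmodel.PDDocument` as the second argument should show which archive belongs on the build path.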

RealSkeptic
  • I recommend to get the app because it has most of the other needed libraries. pdfbox*jar alone is not enough, you need fontbox and jempbox. – Tilman Hausherr Oct 28 '14 at 20:58
  • @TilmanHausherr but it contains tons of other stuff like bouncycastle libraries, which may not be the most current and may conflict with other projects. – RealSkeptic Oct 28 '14 at 21:02
  • That is true. You can of course also use maven. But using pdfbox*.jar alone will not make you happy. – Tilman Hausherr Oct 29 '14 at 07:13
  • The real cause of the problem was that "SuperWoman" imported a HTML file, this is explained in her own answer. The download page has been changed since then to avoid that. – Tilman Hausherr Oct 29 '14 at 07:17
  • @TilmanHausherr I think that answer was not to the question itself, but to the fact that she uses the 1.8.6 jars rather than the recent ones. I doubt that she would have gotten that screen shot if she had an HTML file added as a jar to her build path. I have verified my own answer before posting. – RealSkeptic Oct 29 '14 at 07:38
0

I am now using the latest 1.8.7, but after adding it to libs, setting the jar file to "Add to Build Path", and also checking it in Order and Export, it gives the same error:

 "10-28 13:45:14.510: E/AndroidRuntime(1630): java.lang.NoClassDefFoundError: org.apache.pdfbox.pdmodel.PDDocument"..

I actually wasted 5 hours on that... but then I found iText, which serves the same PDF purpose and runs fine.

Link to iText Tutorial

http://zacktutorials.blogspot.com/2014/07/android-read-and-write-pdf-file-using.html
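A `NoClassDefFoundError` like the one quoted above means the class was visible at compile time but could not be loaded at run time. As a generic diagnostic sketch (the class `ClasspathProbe` is hypothetical, and this does not make PDFBox usable on Android), the runtime classpath can be probed with the standard `Class.forName`:

```java
// Hypothetical helper for illustration only.
class ClasspathProbe {

    // Returns true if the named class can be loaded by this class's class loader.
    static boolean isLoadable(String className) {
        try {
            // 'false' = do not run static initialisers, only try to load the class.
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised by a missing dependency.
            return false;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "org.apache.pdfbox.pdmodel.PDDocument";
        System.out.println(name + " loadable at run time: " + isLoadable(name));
    }
}
```

If the probe reports `false` while the IDE compiles the import without complaint, the jar is on the build path but is not being packaged or deployed with the application.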

Jamil
  • PDFBox doesn't work with Android anyway. There's a derived project from a guy, but that one is mostly to create PDFs, you can't render PDFs with that one. – Tilman Hausherr Oct 28 '14 at 21:01
0

For your problem, you need to replace your pdfbox jar file and also download several supporting jars, or you can try the code below.

Here is code that uses PDFBox and Apache Tika to parse a PDF file and save the output to a location.

You will need the following jars:

bcprov-1.45.jar, fontbox-1.5.0.jar, org.apache.tika.jar, org.apache.tika.parsers.jar, pdfbox-1.3.1.jar

package readpdf;

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.InputStream;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.ContentHandler;

public class readpdf {
  public static void main(String[] args) throws Exception {
    // Placeholder paths: replace them with your own input PDF and output file.
    File inputFile = new File("InputFile");
    File outputFile = new File("OutputFile");

    // try-with-resources closes the stream and the writer, even if parsing fails.
    // FileWriter creates the output file if it does not exist yet.
    try (InputStream is = new FileInputStream(inputFile);
         BufferedWriter bw = new BufferedWriter(new FileWriter(outputFile))) {
      ContentHandler contenthandler = new BodyContentHandler();
      Metadata metadata = new Metadata();
      AutoDetectParser parser = new AutoDetectParser();

      // Tika detects the file type and collects the extracted text in the handler.
      parser.parse(is, contenthandler, metadata);

      String content = contenthandler.toString();
      bw.write(content);

      // Metadata is also available here, e.g. metadata.get("Content-Type").
      System.out.println("*******************Content of PDF ********************");
      System.out.println(content);
    }
    catch (Exception e) {
      e.printStackTrace();
    }
  }
}
Kshitij Kulshrestha
0

I am not sure what you are trying, but I tried replicating your steps:

  1. Downloaded the pdfbox-app-1.8.7.jar from https://pdfbox.apache.org/download.cgi
  2. Imported the jar to my project build path. PS: You don't need to add the javadoc to your build path. That's a separate process.
  3. Tried an import of "org.apache.pdfbox.pdmodel.PDDocument;" (typed up to "org.apache.pdfbox.pdmodel." and then pressed Ctrl+Space to get class hints). It shows up properly.
  4. The same goes for the other packages. E.g. for "org.apache.pdfbox.pdmodel.font.PDFont" you need to type up to "org.apache.pdfbox.pdmodel.font." and then press Ctrl+Space to get class hints.
  5. I think you did an "import org.apache.pdfbox.pdmodel.*;", so it shows all the packages. For the links 3, 4, 5, select the corresponding package from the suggestion list and then press Ctrl+Space to get class suggestions inside that package. Also remove the javadoc from your build path and check.
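One point behind the steps above: a wildcard import in Java is not recursive, so `import org.apache.pdfbox.pdmodel.*;` covers the classes directly in `pdmodel` but not those in subpackages such as `pdmodel.font`. Here is a minimal sketch of the same rule using only JDK packages (the class `WildcardImportDemo` is made up for illustration):

```java
import java.util.*;                            // covers java.util.List, java.util.Map, java.util.Arrays, ...
import java.util.concurrent.ConcurrentHashMap; // subpackage of java.util: needs its own import

// Hypothetical demo class for illustration only.
class WildcardImportDemo {

    // Counts how often each word occurs; uses types from both java.util and java.util.concurrent.
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords(Arrays.asList("pdf", "box", "pdf")));
    }
}
```

Deleting the second import makes the compiler reject `ConcurrentHashMap`, even though `java.util.*` is still imported. The same applies to `PDFont` and `PDType1Font` in `org.apache.pdfbox.pdmodel.font`.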

Hope this helps