
I am trying to split a PDF document with org.apache.pdfbox.multipdf.Splitter and need to perform certain file operations on each resulting single-page PDDocument.

How can I convert a PDDocument to a File object in Java?

Shabbir Dhangot
Chirayu Desai

2 Answers

1

Very simple. I am using PDFBox 1.8.16.

    // declared outside the try block so that it is visible in finally
    PDDocument document = null;
    try {
        document = PDDocument.load(new File(filename));

        // do whatever you want with the document, then write it to disk
        document.save(newfilename);
    } catch (IOException | COSVisitorException e) {
        // in 1.8.x, save() declares COSVisitorException (org.apache.pdfbox.exceptions)
        e.printStackTrace();
    } finally {
        if (document != null) {
            try {
                document.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
    // the saved PDF is now available as an ordinary file
    File file = new File(newfilename);
jprism
  • The OP indirectly has indicated that he uses a 2.x version (the `Splitter` class has been moved to the `org.apache.pdfbox.multipdf` package not before 2.0). Thus, version 1.8.16 references are not really helpful for him. – mkl Oct 18 '18 at 04:54
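Since the question targets PDFBox 2.x, here is a minimal sketch of the same idea for that line: split the document, then call `save(File)` on each single-page `PDDocument` to get a real file on disk. It assumes the 2.x API (`PDDocument.load(File)`, `Splitter.split(PDDocument)`, `PDDocument.save(File)`). The class name `SplitToFiles`, the helper `splitToFiles`, and the `page-N.pdf` naming are made up for illustration and are not part of PDFBox.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.multipdf.Splitter;
import org.apache.pdfbox.pdmodel.PDDocument;

public class SplitToFiles {

    // Hypothetical helper: writes every page of source as its own PDF into outputDir
    // and returns the resulting File objects in page order.
    public static List<File> splitToFiles(File source, File outputDir) throws IOException {
        List<File> result = new ArrayList<>();
        // keep the source document open until all pages have been saved
        try (PDDocument document = PDDocument.load(source)) {
            // a Splitter with default settings produces one document per page
            List<PDDocument> pages = new Splitter().split(document);
            try {
                for (int i = 0; i < pages.size(); i++) {
                    File target = new File(outputDir, "page-" + (i + 1) + ".pdf");
                    pages.get(i).save(target);
                    result.add(target);
                }
            } finally {
                // the split documents have to be closed as well
                for (PDDocument page : pages) {
                    page.close();
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // usage: java SplitToFiles <source.pdf> <existing output directory>
        for (File file : splitToFiles(new File(args[0]), new File(args[1]))) {
            System.out.println(file.getAbsolutePath());
        }
    }
}
```

Each returned `File` can then be passed to whatever file operations are needed. If the files are only needed temporarily, `File.createTempFile` could be used for the targets instead of a fixed output directory.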
0

With Apache Commons IO:

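    InputStream is = null;
    try {
        // note: load(String) exists in 1.8.x; per the comments, 2.0.2 does not accept a string
        PDDocument document = PDDocument.load(filePath);
        File targetFile = new File("nameoffile.pdf");
        // note: per the comments, new PDStream(document) creates an empty stream,
        // so targetFile ends up empty
        PDStream ps = new PDStream(document);
        is = ps.createInputStream();
        FileUtils.copyInputStreamToFile(is, targetFile);
    } catch (IOException io) {
        io.printStackTrace();
    } finally {
        // closeQuietly accepts null, so no explicit check is needed
        IOUtils.closeQuietly(is);
    }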
jprism
kuhajeyan
  • Your code is partly incorrect (there is no `inputStream` method in `PDStream`, and the `load()` method doesn't accept a string in 2.0.2), and it creates an empty file. And you call `inputStream()` twice, and once without closing. – Tilman Hausherr Aug 29 '16 at 10:16
  • @TilmanHausherr Thanks for pointing that out. It seems `PDDocument` has an overloaded `load` method that takes a file path: https://pdfbox.apache.org/docs/1.8.10/javadocs/org/apache/pdfbox/pdmodel/PDDocument.html#load(java.lang.String). – kuhajeyan Aug 29 '16 at 10:23
  • The OP indirectly has indicated that he uses a 2.x version (the `Splitter` class has been moved to the `org.apache.pdfbox.multipdf` package not before 2.0). Thus, version 1.8.10 references are not really helpful for him. – mkl Aug 29 '16 at 15:27
  • Regardless of the version, after correcting the errors the result file will always be empty. `new PDStream(document)` creates an empty stream that uses the scratch file settings of the document passed as the parameter. This answer should be deleted. – Tilman Hausherr Aug 30 '16 at 09:51