
Given a PDF file with a page of any paper size (A0, A1, custom, etc.), how can I split that page into several pages of one fixed size, say A4, and save them to a new PDF document in Java? I tried using the iText library but had no success.

[Image: Original PDF page]

[Image: A4-sized split]

[Image: Final PDF file with A4 pages]

dracodormiens
  • Nothing much. I thought of using iText but I'm new to the library and wasn't able to understand how to proceed. Also I found this online tool called [Sejda](https://www.sejda.com/split-pdf-down-the-middle) but I want to code this in Java and I don't want to split the page just in half. – dracodormiens Sep 28 '17 at 10:31
  • You may want to add a graphic illustration of what you want to achieve. Currently this is quite unclear. – Thorbjørn Ravn Andersen Sep 28 '17 at 11:02
  • **For iText 5.5.x**: The `PdfVeryDenseMergeTool` from [this answer](https://stackoverflow.com/a/29078954/1729265) might help you: While merging multiple documents, it splits pages to fill as much as possible of the target page size. Thus, applying that tool to a single input PDF should do what you want. **For iText 7.0.x**: The above mentioned `PdfVeryDenseMergeTool` has been ported to iText 7, you find it as `PdfDenseMerger` in com.itextpdf:samples package. – mkl Sep 28 '17 at 12:00
  • The images you added to the question seem to indicate that you don't want to *split* but instead to *scale* the original page. Is that the case? Then adapt your question text. Is it not? Then please do explain. By the way, *this online tool called Sejda* you found does still something different than either what your question text describes or what the images appear to indicate. – mkl Sep 29 '17 at 09:51
  • @mkl I have updated the description and the images. I hope it will clarify my problem. I don't want to scale the original page. I just want to split it into different pages having a specified page size (in the above example A4). The tool Sejda just splits the original page into half and not into many equally sized pages. – dracodormiens Sep 29 '17 at 10:56
  • So you want to split the given (possibly huge) page into as many A4 sized tiles as necessary? (I ask to be sure as the final image appears to have much smaller text and all pages have margins which after splitting a page into tiles the inner tile borders wouldn't have.) – mkl Sep 29 '17 at 12:08
  • @mkl Yes.. I want to split the larger page into A4 sized tiles.. I wasn't able to recreate the pages properly according to the problem statement so please excuse my screenshots.. – dracodormiens Oct 03 '17 at 10:53

1 Answer

This task is similar to the one implemented in this answer; the only difference is that the page has to be split in two dimensions instead of one.

A possible solution, therefore, can be implemented along the lines of the AbstractPdfPageSplittingTool used there.

So, again, we have an abstract splitter class:

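import java.io.IOException;
import java.io.OutputStream;

// iText 5 classes (package com.itextpdf.text)
import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.PageSize;
import com.itextpdf.text.Rectangle;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfImportedPage;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfWriter;

public abstract class Abstract2DPdfPageSplittingTool {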
    public void split(OutputStream outputStream, PdfReader... inputs) throws DocumentException, IOException {
        try {
            initDocument(outputStream);
            for (PdfReader reader : inputs) {
                split(reader);
            }
        } finally {
            closeDocument();
        }
    }

    void initDocument(OutputStream outputStream) throws DocumentException {
        final Document document = new Document(PageSize.A4);
        final PdfWriter writer = PdfWriter.getInstance(document, outputStream);
        this.document = document;
        this.writer = writer;
    }

    void closeDocument() {
        try {
            document.close();
        } finally {
            this.document = null;
            this.writer = null;
        }
    }

    void newPage(Rectangle pageSize) {
        document.setPageSize(pageSize);
        if (!document.isOpen())
            document.open();
        else
            document.newPage();
    }

    void split(PdfReader reader) throws IOException {
        for (int page = 1; page <= reader.getNumberOfPages(); page++) {
            split(reader, page);
        }
    }

    void split(PdfReader reader, int page) throws IOException {
        PdfImportedPage importedPage = writer.getImportedPage(reader, page);

        Rectangle pageSizeToImport = reader.getPageSize(page);
        Iterable<Rectangle> rectangles = determineSplitRectangles(reader, page);

        for (Rectangle rectangle : rectangles) {
            // each split rectangle becomes the page size of a new result page
            newPage(rectangle);
            PdfContentByte directContent = writer.getDirectContent();
            directContent.saveState();
            // clip to the rectangle so that only this part of the source page is drawn
            directContent.rectangle(rectangle.getLeft(), rectangle.getBottom(), rectangle.getWidth(), rectangle.getHeight());
            directContent.clip();
            directContent.newPath();

            directContent.addTemplate(importedPage, -pageSizeToImport.getLeft(), -pageSizeToImport.getBottom());

            directContent.restoreState();
        }
    }

    protected abstract Iterable<Rectangle> determineSplitRectangles(PdfReader reader, int page);

    Document document = null;
    PdfWriter writer = null;
}

(Abstract2DPdfPageSplittingTool.java)

This utility lets you split each source page into a custom set of result pages, each of which may represent an arbitrary rectangular part of the source page (with edges parallel to the corresponding page edges).
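To illustrate that the rectangles need not form an A4 grid, here is a minimal sketch that cuts a page into four equal quadrants. It is plain Java without iText so it can be run on its own; the class `QuadrantSplitSketch` and the `float[] {llx, lly, urx, ury}` representation are assumptions made for this illustration. In a `determineSplitRectangles` override, each array would become a `new Rectangle(llx, lly, urx, ury)`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of iText or of the tool above.
final class QuadrantSplitSketch {
    /**
     * Splits the page box into four equal quadrants.
     * Each entry is {llx, lly, urx, ury}, the argument order of
     * iText 5's Rectangle(llx, lly, urx, ury) constructor.
     */
    static List<float[]> quadrants(float left, float bottom, float right, float top) {
        float midX = (left + right) / 2;
        float midY = (bottom + top) / 2;
        List<float[]> result = new ArrayList<>();
        result.add(new float[] { left, midY, midX, top });     // top-left
        result.add(new float[] { midX, midY, right, top });    // top-right
        result.add(new float[] { left, bottom, midX, midY });  // bottom-left
        result.add(new float[] { midX, bottom, right, midY }); // bottom-right
        return result;
    }
}
```

The order of the list determines the order of the result pages, so this sketch emits the top row first, the same reading order as the grid example.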

You can use the tool like this for splitting pages along a grid of A4 cells:

Abstract2DPdfPageSplittingTool tool = new Abstract2DPdfPageSplittingTool() {
    @Override
    protected Iterable<Rectangle> determineSplitRectangles(PdfReader reader, int page) {
        Rectangle targetSize = PageSize.A4;
        List<Rectangle> rectangles = new ArrayList<>();
        Rectangle pageSize = reader.getPageSize(page);
        // rows from the top edge downwards
        for (float y = pageSize.getTop(); y > pageSize.getBottom() + 5; y -= targetSize.getHeight()) {
            // columns from the left edge rightwards
            for (float x = pageSize.getLeft(); x < pageSize.getRight() - 5; x += targetSize.getWidth()) {
                rectangles.add(new Rectangle(x, y - targetSize.getHeight(), x + targetSize.getWidth(), y));
            }
        }
        return rectangles;
    }
};
tool.split(RESULT_OUTPUT_STREAM, new PdfReader(SOURCE_PDF));

(based on the SplitPages test testSplitDocumentA6)

Because there is a certain tolerance in the specification of paper sizes, I drop rectangles which would contain only a very narrow band of the original document at the right or at the bottom (the loops start at the top left corner), to prevent empty result pages. This assumes that such bands lie entirely within the original page margin. If you don't want that, remove (or change) the + 5 or - 5 in the sample determineSplitRectangles loops.
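The effect of the tolerance can be checked without iText. The following sketch repeats the two loops on plain numbers, assuming the rounded point sizes 2384 x 3370 for A0 and 595 x 842 for A4 (the values I believe iText 5's `PageSize` constants use); the class `TileGridSketch` and its `tolerance` parameter are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of iText or of the tool above.
final class TileGridSketch {
    /**
     * Returns the tiles as {llx, lly, urx, ury}, row by row from the top-left corner.
     * A trailing row or column covering at most 'tolerance' units of the page is skipped.
     */
    static List<float[]> tiles(float left, float bottom, float right, float top,
                               float tileWidth, float tileHeight, float tolerance) {
        List<float[]> result = new ArrayList<>();
        for (float y = top; y > bottom + tolerance; y -= tileHeight) {
            for (float x = left; x < right - tolerance; x += tileWidth) {
                result.add(new float[] { x, y - tileHeight, x + tileWidth, y });
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A0 page split into A4 tiles, sizes in points (assumed values, see above)
        System.out.println(tiles(0, 0, 2384, 3370, 595, 842, 5).size()); // prints 16
        System.out.println(tiles(0, 0, 2384, 3370, 595, 842, 0).size()); // prints 25
    }
}
```

With these numbers, four A4 widths cover 2380 of the 2384 points and four A4 heights cover 3368 of the 3370 points. Without the tolerance, the remaining 4 and 2 point bands would each produce an extra, nearly empty column and row, giving 25 pages instead of 16.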

mkl