I read a specific location on each page of an input PDF file with n pages and build a list of the texts found at those locations. Then I create a new PDF document and write the strings from the list into the cells of a table. I ran into two main problems.

  1. I want three columns in the table, but if the number of strings in the list is not a multiple of 3 (the number of columns), the leftover strings are not printed. For example, with 4 strings the program prints the first three in the three cells of the first row and drops the fourth. I wrote some code that takes the number of strings mod (%) 3 and adds blank cells containing a dot (.) to complete the row, so that no string is left out. Is there a better way to do it?

  2. The program runs in IntelliJ when I run the main class, and it generates the output PDF file. But when I build the executable jar and double-click it, nothing happens. To double check, I ran the jar in the IntelliJ terminal and found that it throws the following error: [screenshot of the error message]

Why does the same problem not occur when I run the program in IntelliJ, and how do I overcome it? I rewrote the whole project in Eclipse; Eclipse does not compile it at all and reports the same problem that running the executable jar on the command line inside IntelliJ gives.

Here are the three classes in the project:

package addressLabels;

import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.FilteredTextRenderListener;
import com.itextpdf.text.pdf.parser.LocationTextExtractionStrategy;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;
import com.itextpdf.text.pdf.parser.RegionTextRenderFilter;
import com.itextpdf.text.pdf.parser.RenderFilter;
import com.itextpdf.text.pdf.parser.TextExtractionStrategy;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class Driver {
    public static final String SRC = "C:/temp/ebay.pdf";

    public static void main(String[] args) throws IOException, DocumentException {
        ReadCertainLocationOnPageInPdf contentsObj = new ReadCertainLocationOnPageInPdf(SRC);
        WritePdf writer = new WritePdf(contentsObj.getListOfAddresses());
        //contentsObj.printListOfAddresses();
    }

}//class Driver ends here.

package addressLabels;


import com.itextpdf.text.Rectangle;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.*;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReadCertainLocationOnPageInPdf {

    //private String cleanTextMarkedForTokenization;
    private List<String> listOfAddresses;

    public ReadCertainLocationOnPageInPdf(String pdfFileAddress){
        this.listOfAddresses = new ArrayList<String>();
        parsePdf(pdfFileAddress);
    }//constructor ends here.

    private void parsePdf(String pdfFileAddress) {

        File f = new File(pdfFileAddress);
        if (f.isFile() && f.canRead()){
            try {
                PdfReader reader = new PdfReader(pdfFileAddress);
                int numPages = reader.getNumberOfPages();

                //Get information about the page size
                //Rectangle mediabox = reader.getPageSize(1);
                //printDataAboutThisPage(mediabox);
                //StringBuilder sb = new StringBuilder("");
                for (int pageNum = 1; pageNum <= numPages; pageNum++){
                    String oneAddress = getTextFromThisPage(pageNum, reader);
                    this.addOneAddressToListOfAddresses(oneAddress);
                    //sb.append(getTextFromThisPage(pageNum, reader)).append("\n\n");
                }
                //this.addOneAddressToListOfAddresses(sb.toString());

                reader.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }//if ends here
        //System.out.println(sb.toString());
    }

    private void printDataAboutThisPage(Rectangle mediabox) {
        //getRight() and getTop() give the coordinates of the upper right corner
        float x = mediabox.getRight();
        float y = mediabox.getTop();
        System.out.println("Upper right corner x: " + x);
        System.out.println("Upper right corner y: " + y);
        System.out.println("The values of x increase from left to right; the values of y increase from bottom to top. \n The unit of the measurement system in PDF is called \"user unit\". \n By default one user unit coincides with one point (this can change, but you won't find many PDFs with a different UserUnit value).\n In normal circumstances, 72 user units = 1 inch.");
    }

    private String getTextFromThisPage(int pageNo, PdfReader reader) throws IOException {
        //java.awt.geom.Rectangle2D rect = new java.awt.geom.Rectangle2D.Float(226, 547, 240, 158);
        java.awt.geom.Rectangle2D rect = new java.awt.geom.Rectangle2D.Float(226, 547, 240, 158);
        RenderFilter regionFilter = new RegionTextRenderFilter(rect);
        TextExtractionStrategy strategy = new FilteredTextRenderListener(new LocationTextExtractionStrategy(), regionFilter);
        String t = PdfTextExtractor.getTextFromPage(reader, pageNo, strategy);
        t = this.cleanOneLabel(t);
        return t;
    }

    private String cleanOneLabel(String t) {
        StringBuilder sb2 = new StringBuilder("");
        String[] lines = t.split(System.getProperty("line.separator"));
        for(String s:lines) {
            if(!s.equals(""))
                sb2.append(s).append("\n");
        }
        String pattern = "(?m)^\\s*\\r?\\n|\\r?\\n\\s*(?!.*\\r?\\n)";
        String replacement = "";
        return sb2.toString().replaceAll(pattern, replacement);// ??? s = s.replaceAll("\n+", "\n");

    }
    private String cleanOneLabel2(String t) {
        StringBuilder sb2 = new StringBuilder("");
        String[] lines = t.split(System.getProperty("line.separator"));
        for(int i = 0; i < lines.length; i++) {
            if(lines[i].contains("Post to:")) {
                lines[i] = lines[i].replace("Post to:", "pakbay-Post to:");
            }
        }
        for(String s:lines) {
            if(!s.equals(""))
                sb2.append(s).append("\n");
        }
        String pattern = "(?m)^\\s*\\r?\\n|\\r?\\n\\s*(?!.*\\r?\\n)";
        String replacement = "";
        return sb2.toString().replaceAll(pattern, replacement);// ??? s = s.replaceAll("\n+", "\n");

    }

    public List<String> getListOfAddresses(){
        return this.listOfAddresses;
    }

    public void printListOfAddresses(){
        for(int i = 0; i < listOfAddresses.size(); i++){
            System.out.print(listOfAddresses.get(i));
        }
    }

    public void addOneAddressToListOfAddresses(String oneAddress) {
        //clean the string before adding it to the list of addresses.
        //Remove extra spaces, tabs and blank lines from the passed string.
        String pattern = "(?m)^\\s*\\r?\\n|\\r?\\n\\s*(?!.*\\r?\\n)";
        String replacement = "";
        oneAddress = oneAddress.replaceAll(pattern, replacement);
        //Add the cleaned address to the list of addresses.
        this.listOfAddresses.add(oneAddress);
    }
}//class ReadCertainLocationOnPageInPdf ends here.

package addressLabels;

import java.io.FileOutputStream;
import java.util.Date;

import com.itextpdf.text.*;
import com.itextpdf.text.pdf.PdfPCell;
import com.itextpdf.text.pdf.PdfPTable;
import com.itextpdf.text.pdf.PdfWriter;

public class WritePdf {
    private static String FILE = "C:/temp/ebay-output.pdf";
    private java.util.List<String> listOfAddresses;


    public WritePdf(java.util.List<String> listOfAddresses) {
        this.listOfAddresses = listOfAddresses;
        System.out.println("Size: " + this.getListOfAddresses().size());
        System.out.println("Element at zeroth position in list: " + this.getListOfAddresses().get(0));
        System.out.println("Element at nth position in list: " + this.getListOfAddresses().get(this.getListOfAddresses().size()-1));
        writeTheListOnPdf();
    }

    private void writeTheListOnPdf() {
        try {
            Document document = new Document();
            PdfWriter.getInstance(document, new FileOutputStream(FILE));
            document.open();
            addMetaData(document);
            //addTitlePage(document);
            addContent(document);
            document.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private void addContent(Document document) throws DocumentException{
        PdfPTable table  = makeTable();
        for (int i = 0; i < this.getListOfAddresses().size() ; i++) {
            PdfPCell cell = makeCell();
            cell.addElement(new Phrase(this.getListOfAddresses().get(i)));
            table.addCell(cell);
        }
        /* we have three columns in the table. If the number of addresses is not exactly equal to the number of
         * cells created then the pdf file is corrupt and the program throws error. So we have to add some extra cells
         * to complete a row. */
        calculateAndAddExtraCells(table);
        document.add(table);
    }

    private void calculateAndAddExtraCells(PdfPTable table) {

        //Number of filler cells needed to complete the last row (0 if the row is already complete).
        int loopCounter = (3 - this.getListOfAddresses().size() % 3) % 3;

        for (int i = 1; i <= loopCounter ; i++) {
            PdfPCell blankCell = this.makeCell();
            blankCell.addElement(new Phrase("."));
            table.addCell(blankCell);
        }
    }

    private PdfPCell makeCell() {
        PdfPCell cell = new PdfPCell();
        cell.setPadding(4);
        //cell.setNoWrap(true);
        cell.setHorizontalAlignment(PdfPCell.ALIGN_CENTER);
        cell.setVerticalAlignment(PdfPCell.ALIGN_MIDDLE);
        cell.setBorder(Rectangle.NO_BORDER);
        return cell;
    }

    private PdfPTable makeTable() {
        PdfPTable table = new PdfPTable(3);
        table.setWidthPercentage(100);
        table.setSplitRows(false);
        return table;
    }

    private void addMetaData(Document document) {
        document.addTitle("Address labels for the input pdf file");
        document.addSubject("Address labels");
        document.addKeywords("ebay, amazon, addresses, labels");
        document.addAuthor("Ajmal Khan");
        document.addCreator("Ajmal Khan");
    }

    public java.util.List<String> getListOfAddresses() {
        return listOfAddresses;
    }

    public void setListOfAddresses(java.util.List<String> listOfAddresses) {
        this.listOfAddresses = listOfAddresses;
    }
}//writePdf ends here.

Here is the pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.swedishnow</groupId>
    <artifactId>ebayAddresses</artifactId>
    <version>1.0-SNAPSHOT</version>

    <build>
        <plugins>
            <plugin>

                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifest>
                            <addClasspath>true</addClasspath>
                            <mainClass>addressLabels.Driver</mainClass>
                        </manifest>
                    </archive>
                </configuration>

            </plugin>
        </plugins>
    </build>

    <dependencies>

        <dependency>
            <groupId>com.itextpdf</groupId>
            <artifactId>kernel</artifactId>
            <version>7.0.0</version>
        </dependency>
        <dependency>
            <groupId>com.itextpdf</groupId>
            <artifactId>layout</artifactId>
            <version>7.0.0</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
            <version>1.7.18</version>
        </dependency>
        <dependency>
            <groupId>com.itextpdf</groupId>
            <artifactId>itext-xtra</artifactId>
            <version>5.5.4</version>
        </dependency>

        <dependency>
            <groupId>com.itextpdf</groupId>
            <artifactId>itextpdf</artifactId>
            <version>5.5.9</version>
        </dependency>


    </dependencies>

</project>

I used the method recommended in this video to create the executable jar in IntelliJ IDEA Community Edition 2018.1.5.

Amedee Van Gasse
Emil Jan
  • Please do not add screenshots with error messages, but add them as text so that people can read them directly and there is a chance for search engines to find this post by searching for the error message's content. – Michael Lihs Jul 12 '18 at 23:12
  • You are mixing incompatible versions of iText! `itext-xtra` **MUST** be the same version as `itextpdf` (latest released version: `5.5.12`) and your code is typical of iText 5 code so your dependency on the `kernel` and `layout` modules of iText 7 (latest released version: `7.1.2`) is not needed. – Amedee Van Gasse Jul 13 '18 at 06:59
  • You have 2 questions that are unrelated. You **MUST** create 2 separate questions. Those are the rules of Stack Overflow, I didn't make them. You risk getting your question flagged as "Too Broad". – Amedee Van Gasse Jul 13 '18 at 07:01

2 Answers

I want to have three columns in the table but if my strings in the list were not a multiple of 3 (i.e., the number of columns) then it would leave extra strings and would not print them. For example if I have 4 strings to print then the program would print the first three strings in three cells in the first row but would leave one string. I wrote some code that checks the number of strings and gets it mod (%) with 3 and adds some blank cells with a dot (.) in it to supply with the extra cells to complete the row so that none of the strings are left. Is there a better way to do it?

There is a PdfPTable method filling the final row for you:

/**
 * Completes the current row with the default cell. An incomplete row will
 * be dropped but calling this method will make sure that it will be present
 * in the table.
 */
public void completeRow()

Simply call this method before adding the table to the document.
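
For readers who would rather keep a hand-written padding step, here is a minimal sketch of the arithmetic behind completing a row. The names `FillerCells` and `fillersNeeded` are made up for illustration and are not part of iText; the sketch only assumes that a row is complete when the cell count is a multiple of the column count.

```java
// Hypothetical helper, not part of iText: computes how many filler cells
// a partially filled last row still needs.
final class FillerCells {

    private FillerCells() {
    }

    /**
     * Returns the number of cells missing from the last row.
     * The result is 0 when itemCount is already a multiple of columns.
     */
    static int fillersNeeded(int itemCount, int columns) {
        if (itemCount < 0 || columns <= 0) {
            throw new IllegalArgumentException("itemCount must be >= 0 and columns > 0");
        }
        // The outer modulo maps a full row (remainder 0) to 0 instead of 'columns'.
        return (columns - itemCount % columns) % columns;
    }

    public static void main(String[] args) {
        System.out.println(fillersNeeded(4, 3)); // 4 addresses, 3 columns
        System.out.println(fillersNeeded(6, 3)); // last row already full
    }
}
```

With 4 addresses and 3 columns this yields 2 filler cells, and with 6 addresses it yields 0, so no extra row of fillers is added when the last row is already full.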


I ran the jar in the intellij terminal and found out that it throws the following error

NoSuchMethodError: com.itextpdf.text.pdf.parser.RegionTextRenderFilter.<init>(Ljava/awt/geom/Rectangle2D;)V

That error is appropriate: the only constructor of the iText 5 RegionTextRenderFilter that accepts a Rectangle2D does not take a java.awt.geom.Rectangle2D but a com.itextpdf.awt.geom.Rectangle2D.

So the surprising part is that the program runs in IntelliJ, not that it fails elsewhere.
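
One way to investigate such a difference is to print where the JVM actually loaded a class from, once when running in IntelliJ and once when running the jar. The following is a small sketch using only the standard library; `WhereFrom` and `locationOf` are hypothetical names introduced here, not part of iText or IntelliJ.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper: reports the jar or directory a class was loaded from.
final class WhereFrom {

    private WhereFrom() {
    }

    /** Returns the code source location of the class, or a placeholder if there is none. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader (for example java.lang.String) have no code source.
        return source == null ? "(no code source)" : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass fully qualified class names as arguments; defaults to this class itself.
        String[] names = args.length > 0 ? args : new String[] {WhereFrom.class.getName()};
        for (String name : names) {
            System.out.println(name + " <- " + locationOf(Class.forName(name)));
        }
    }
}
```

Calling `WhereFrom.locationOf(RegionTextRenderFilter.class)` from `Driver.main` in both environments would show whether they pick up the same iText jar. That they differ is only a guess here, not something the question confirms.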

This might be a follow-up problem to your inconsistent dependencies. As @Amedee already mentioned in a comment:

You are mixing incompatible versions of iText! itext-xtra MUST be the same version as itextpdf (latest released version: 5.5.12) and your code is typical of iText 5 code so your dependency on the kernel and layout modules of iText 7 (latest released version: 7.1.2) is not needed.
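
Taken literally, that comment suggests a dependency section along the following lines. This is a sketch, not a tested build: it assumes the project needs `itext-xtra` at all, and it uses 5.5.12 only because the comment names it as the latest released version at the time.

```xml
<dependencies>
    <!-- iText 5 core; the code in the question uses the iText 5 API -->
    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>itextpdf</artifactId>
        <version>5.5.12</version>
    </dependency>
    <!-- Same version as itextpdf, as the comment requires -->
    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>itext-xtra</artifactId>
        <version>5.5.12</version>
    </dependency>
    <!-- The iText 7 kernel and layout modules are left out on purpose -->
</dependencies>
```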

mkl

I found solutions to both problems. For the problem with the executable jar, I added the following dependencies to make it work:

<dependencies>
    <dependency>
        <groupId>commons-lang</groupId>
        <artifactId>commons-lang</artifactId>
        <version>2.1</version>
    </dependency>
    <dependency>
        <groupId>org.codehaus.plexus</groupId>
        <artifactId>plexus-utils</artifactId>
        <version>1.1</version>
    </dependency>
</dependencies>

I also deleted the itextpdf jar file that I had in my lib folder. The updated pom.xml that produced a working executable jar for me now looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.khanajmal</groupId>
    <artifactId>amazon</artifactId>
    <version>1.0-SNAPSHOT</version>

    <build>
        <plugins>
            <plugin>

                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifest>
                            <addClasspath>true</addClasspath>
                            <mainClass>addressLabels.Driver</mainClass>
                        </manifest>
                    </archive>
                </configuration>

            </plugin>
        </plugins>
    </build>

    <dependencies>
        <dependency>
            <groupId>commons-lang</groupId>
            <artifactId>commons-lang</artifactId>
            <version>2.1</version>
        </dependency>
        <dependency>
            <groupId>org.codehaus.plexus</groupId>
            <artifactId>plexus-utils</artifactId>
            <version>1.1</version>
        </dependency>

        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
            <version>1.7.18</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/com.itextpdf/itext-xtra -->
        <dependency>
            <groupId>com.itextpdf</groupId>
            <artifactId>itext-xtra</artifactId>
            <version>5.4.5</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/com.itextpdf/itextpdf -->
        <dependency>
            <groupId>com.itextpdf</groupId>
            <artifactId>itextpdf</artifactId>
            <version>5.5.4</version>
        </dependency>

    </dependencies>

</project>

Another notable step: IntelliJ created the META-INF folder in src/main/java/, and I moved it to src/main/resources after reading about this somewhere in a post on stackoverflow.com.

The output jar was created in /out/artifacts/the_name_given_by_you, where the last part is the name you choose when creating the artifact.

For the second problem, the better solution than inserting filler cells containing a dot is the method table.completeRow(). I did not use it earlier because the cells it creates have borders, while all my other cells have none: my method makeCell() sets the border of every cell it creates to NO_BORDER. That is why I created the required number of extra cells myself and put a dot (or even just "") in them. In the final version I call table.completeRow() and then set the border to NO_BORDER for every cell in the last row of the table, using the following code:

table.completeRow();
//The extra cells inserted by table.completeRow() have borders. Removing them below.
int lastRowIndex = table.getLastCompletedRowIndex();
PdfPCell[] cells = table.getRow(lastRowIndex).getCells();
for (int i = 0; i < cells.length; i++) {
    PdfPCell c = cells[i];
    c.setBorder(Rectangle.NO_BORDER);
}
//calculateAndAddExtraCells(table);
document.add(table);

I created a new project from scratch to avoid running into further problems, so there are some differences between the earlier pom.xml and this one.

So it works this way as well as it did with the earlier code.

Thank you to all who tried to help.

Emil Jan