I have seven PDF files in one folder, each saved under its Invoice No. value.

Bill-to Customer No. is the Dealer Code. I have connected to an MS Access database and can fetch the email ID and Dealer Code. The code differs in each PDF. Now my task is to search for this Dealer Code in all the PDF files and send each matching PDF to the corresponding email ID. The database content is as follows:

STE002 a@gmail.com
C04004 a@gmail.com
RS0002 b@gmail.com
RS0006 b@gmail.com
RS0009 c@gmail.com
RS0001 c@gmail.com
C01020 d@gmail.com

My code is as follows.

Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
    String url = "jdbc:odbc:PDF1";
    Connection con = DriverManager.getConnection(url);
     java.sql.Statement st = con.createStatement();

    String sql = "SELECT * FROM new";    // Retrieve all rows from the "new" table in the database
    ResultSet rs = st.executeQuery(sql);

    while(rs.next()){


    String code = rs.getString("Dealer Code");
    String email = rs.getString("Dealer Email ID");

    System.out.println(code + " " + email);

    //email


        String to = email;

      String from = "abcd.gmail.com";

      final String username = "abcd.gmail.com";//change accordingly
      final String password = "*******";//change accordingly

      // Sending email through Gmail's SMTP server
      String host = "smtp.gmail.com";

      Properties props = new Properties();
      props.put("mail.smtp.auth", "true");
      props.put("mail.smtp.starttls.enable", "true");
      props.put("mail.smtp.host", host);
      props.put("mail.smtp.port", "25");

      // Get the Session object.
      Session session = Session.getInstance(props,
         new javax.mail.Authenticator() {
            protected PasswordAuthentication getPasswordAuthentication() {
               return new PasswordAuthentication(username, password);
            }
         });

      try {
         // Create a default MimeMessage object.
         Message message = new MimeMessage(session);

         // Set From: header field of the header.
         message.setFrom(new InternetAddress(from));

         // Set To: header field of the header.
         message.setRecipients(Message.RecipientType.TO,
            InternetAddress.parse(to));

         // Set Subject: header field
         message.setSubject("Testing Subject");

         // Create the message part
         BodyPart messageBodyPart = new MimeBodyPart();

         // Now set the actual message
         messageBodyPart.setText("This is message body");

         // Create a multipart message
         Multipart multipart = new MimeMultipart();

         // Set text message part
         multipart.addBodyPart(messageBodyPart);

         // Part two is attachment
         messageBodyPart = new MimeBodyPart();

         String filename = "E:\\Sales.pdf";

         DataSource source = new FileDataSource(filename);
         messageBodyPart.setDataHandler(new DataHandler(source));
         messageBodyPart.setFileName(filename);
         multipart.addBodyPart(messageBodyPart);

         // Send the complete message parts
         message.setContent(multipart);

         // Send message
         Transport.send(message);

         System.out.println("Sent message successfully....");

      } catch (MessagingException e) {
         throw new RuntimeException(e);
      }
    }
User9999

1 Answer

Try Apache PDFBox, which can extract the text of a PDF (see its text extraction tutorial).

To find the PDF files, use File.listFiles(FileFilter filter). Here is an example of this method:

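import java.io.File;
import java.io.FileFilter;

// The class name is illustrative; any name works
public class ListPdfFiles {

    private static String directoryPath = "/Users/aal/Documents";
    private static String extension = "pdf";

    public static void main(String[] args) {

        File directory = new File(directoryPath);

        // accept only regular files whose name ends with ".pdf" (case-insensitive)
        FileFilter fileFilter = new FileFilter() {
            @Override
            public boolean accept(File pathname) {
                return pathname.isFile()
                        && pathname.getName().toLowerCase().endsWith("." + extension);
            }
        };

        // listFiles returns null if the path is not a directory or cannot be read
        File[] paths = directory.listFiles(fileFilter);
        if (paths == null) {
            System.err.println("Not a readable directory: " + directoryPath);
            return;
        }

        for (File path : paths) {
            // prints the path of each matching PDF file
            System.out.println(path);
        }
    }
}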
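
Once the PDF files are listed, each one still has to be tied to a database row. Below is a minimal sketch of that step. It assumes the dealer code has already been read from each PDF (for example with PDFBox text extraction) and collected into a map. The class `AttachmentPlanner`, the method `groupByEmail`, and the file names are hypothetical and not part of the original code. Because several dealer codes in the table share one email address, the sketch groups the files per recipient:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: names here are illustrative, not from the original post.
class AttachmentPlanner {

    /**
     * Groups PDF files by recipient address.
     * emailByDealerCode: dealer code -> email, as read from the database.
     * pdfByDealerCode:   dealer code -> PDF file, as found by scanning the folder.
     */
    static Map<String, List<File>> groupByEmail(Map<String, String> emailByDealerCode,
                                                Map<String, File> pdfByDealerCode) {
        Map<String, List<File>> filesByEmail = new LinkedHashMap<>();
        for (Map.Entry<String, String> row : emailByDealerCode.entrySet()) {
            File pdf = pdfByDealerCode.get(row.getKey());
            if (pdf == null) {
                continue; // no PDF in the folder mentions this dealer code
            }
            filesByEmail.computeIfAbsent(row.getValue(), k -> new ArrayList<>()).add(pdf);
        }
        return filesByEmail;
    }

    public static void main(String[] args) {
        Map<String, String> emailByDealerCode = new LinkedHashMap<>();
        emailByDealerCode.put("STE002", "a@gmail.com");
        emailByDealerCode.put("C04004", "a@gmail.com");
        emailByDealerCode.put("RS0002", "b@gmail.com");

        // Assumed result of extracting the dealer code from each PDF; file names are made up.
        Map<String, File> pdfByDealerCode = new LinkedHashMap<>();
        pdfByDealerCode.put("STE002", new File("invoice-1.pdf"));
        pdfByDealerCode.put("C04004", new File("invoice-2.pdf"));

        groupByEmail(emailByDealerCode, pdfByDealerCode)
                .forEach((email, files) -> System.out.println(email + " -> " + files));
    }
}
```

Each entry of the result can then become one message: add one MimeBodyPart per file to the Multipart instead of the hard-coded E:\\Sales.pdf.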
  • I tried PDFBox and then split on the PDF content (i.e., Invoice No.), but I still need to search in that target folder – User9999 Aug 20 '15 at 08:06
  • https://stackoverflow.com/questions/7199911/how-to-file-listfiles-in-alphabetical-order – Tilman Hausherr Aug 20 '15 at 08:57
  • Yes, I'm trying it, and I can list the files. But how can I search the content of all the PDF files in that list for the Dealer Code (i.e., Bill-to Customer No.)? – User9999 Aug 20 '15 at 09:43
  • @KiranP I answered that in the other question: Pattern.compile("Bill\\-to Customer No\\. ([A-Z0-9]+)"); and use that on the result of a text stripping like you did in the question https://stackoverflow.com/questions/32002830/how-to-split-pdf-file-by-result-in-java-pdfbox NOTE: I mean the question, not the answer. Your own question shows how to text extract a PDF as a whole. – Tilman Hausherr Aug 20 '15 at 10:47
  • I tried this pattern, but it isn't working. The error is: "header page ??? " + page + " skipped" – User9999 Aug 20 '15 at 11:27
  • @KiranP sigh... I said the *question*, not the answer. PDFTextStripper stripper = new PDFTextStripper(); sb.append(stripper.getText(pd)); Pattern p = Pattern.compile("Bill\\-to Customer No\\. ([A-Z0-9]+)"); Matcher m = p.matcher(sb); if (m.find()) bcno = m.group(1); And then check whether bcno is the one you want. Don't forget to close the document. – Tilman Hausherr Aug 20 '15 at 11:56
  • I tried your code. Now I'm able to build successfully, but the Bill-to Customer No. is not returned; it comes back as an empty value – User9999 Aug 20 '15 at 12:15
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/87481/discussion-between-tilman-hausherr-and-kiran-p). – Tilman Hausherr Aug 20 '15 at 12:18
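
The pattern from the comments can be tried on plain strings before involving PDFBox at all. This is a minimal sketch, with the caveat that the real extracted text is not shown in the thread. One possible reason for an empty result (an assumption) is that the stripped text puts a line break or extra spaces between the label and the code, so this variant tolerates any whitespace there. The class `DealerCodeExtractor`, the method `extractDealerCode`, and the sample strings are hypothetical:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: class and method names are illustrative.
class DealerCodeExtractor {

    // Label, optional period/colon, any whitespace (including line breaks), then the code.
    private static final Pattern DEALER_CODE =
            Pattern.compile("Bill-to\\s+Customer\\s+No\\.?\\s*:?\\s*([A-Z0-9]+)\\b");

    // Returns the first dealer code found in the extracted PDF text, if any.
    static Optional<String> extractDealerCode(String pdfText) {
        Matcher m = DEALER_CODE.matcher(pdfText);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up samples standing in for the output of PDFTextStripper.getText(...)
        String sameLine = "Invoice No. 1001\nBill-to Customer No. RS0002\n";
        String nextLine = "Bill-to Customer No.\nC04004\n";

        System.out.println(extractDealerCode(sameLine));
        System.out.println(extractDealerCode(nextLine));
        System.out.println(extractDealerCode("no label here"));
    }
}
```

If the result is still empty on the real text, printing the output of stripper.getText(...) shows exactly how the label appears there.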