I have a Java application that reads user replies from a Gmail inbox, processes them, and stores them in a database. My current problem is that I want to identify user signatures in the email content, trim them off, and store only the rest of the content in the database.

I read each email into a MimeMessage, get its content, and process it.

Is there a way to trim signatures from the MimeMessage content, or is there a header that tells me that the email contains a user signature and where it starts?

I have searched for this but found nothing. Any help would be greatly appreciated. Thanks :)

Nisha Goyal

2 Answers

Visit http://javamail-crypto.sourceforge.net/. It is an add-on to Sun's JavaMail API that provides simple encryption and decryption of emails using S/MIME and/or OpenPGP.


I just had this problem myself. Since I couldn't find much information on this issue, I am going to post my answer here, even though this question is quite old.

I used the code from https://stackoverflow.com/a/34689614/4001577 to retrieve the message as HTML.

Sadly, there was no marker that would tell me where the signature starts, since it is basically content added automatically by the mail client.

What I did was basically the following:

  • Look for an anchor whose URL points to Xing or LinkedIn (since all of our signatures contain the company's social media profiles)
  • Get the index of that element
  • Remove every element in the body from that element onward, itself included
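// Uses jsoup: org.jsoup.nodes.Element and org.jsoup.select.Elements
private static Element trimSignature(final Element body) {
    // Find the first anchor that links to the company's social media profile.
    // Add further markers (e.g. a LinkedIn URL) to the condition as needed.
    Element signatureAnchor = null;
    for (Element anchor : body.getElementsByTag("a")) {
        if (anchor.attr("href").contains("xing.com/companies")) {
            signatureAnchor = anchor;
            break;
        }
    }
    if (signatureAnchor == null) {
        return body; // no signature marker found, leave the body untouched
    }
    // The anchor may be nested, so walk up to the direct child of body that
    // contains it; only that element's sibling index is valid for body.children().
    Element signatureRoot = signatureAnchor;
    while (signatureRoot.parent() != null && signatureRoot.parent() != body) {
        signatureRoot = signatureRoot.parent();
    }
    // children() returns a copy, so removing elements while iterating is safe.
    final Elements children = body.children();
    for (int i = signatureRoot.elementSiblingIndex(); i < children.size(); i++) {
        children.get(i).remove();
    }
    return body;
}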
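
If the message also has a text/plain part, a complementary heuristic may be worth trying. Some mail clients follow the convention (described in RFC 3676) of starting the signature with a line containing only "-- " (dash, dash, space). The sketch below cuts the text at the first such line. The class name PlainTextSignature and the method trim are hypothetical names for illustration, not part of JavaMail, and treating the trailing space as optional is an assumption for clients that strip it. Many clients do not emit the separator at all, so this is a best-effort check and not a reliable marker.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of JavaMail or jsoup.
final class PlainTextSignature {

    // A line consisting only of "-- ". The trailing space is optional here
    // (an assumption) because some clients strip trailing whitespace.
    private static final Pattern SIG_SEPARATOR = Pattern.compile("(?m)^-- ?$");

    private PlainTextSignature() {}

    /**
     * Returns the text before the first separator line with trailing
     * whitespace removed, or the input unchanged if no separator is found.
     */
    static String trim(final String plainText) {
        final Matcher m = SIG_SEPARATOR.matcher(plainText);
        if (!m.find()) {
            return plainText;
        }
        return plainText.substring(0, m.start()).replaceAll("\\s+\\z", "");
    }
}
```

Quoted reply lines usually start with "> ", so a separator inside quoted text is not matched by this pattern. A body that legitimately contains a bare "--" line would be cut short, which is the trade-off of this heuristic.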
Tim Visée
Patrick M