Is anyone familiar with a Java library that helps with parsing the fields (date, subject, from, to) of the email below?

Message-ID: <19815303.1075861029555.JavaMail.ss@kk>
Date: Wed, 6 Mar 2010 12:32:20 -0800 (PST)
From: someone@someotherplace.com
To: someone@someplace.com
Subject: some subject
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-From: one, some <some.one@someotherplace.com>
X-To: one
X-cc: 
X-bcc: 
X-Folder: Bob\Inbox
X-Origin: Bob-R
X-FileName: rbob (Non-Privileged).pst


some message
Kareem

3 Answers

JavaMail is an Oracle library that provides mail and mail-related services (such as parsing conventional and MIME messages) in the javax.mail package. Apache also has a Commons Email library for mail handling.

In the JavaMail API, a simple way to parse a string containing an email message (which may or may not be explicitly MIME) is as follows:

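String content = ...
// Parsing needs no mail server, so an empty Properties object is enough.
Session s = Session.getInstance(new Properties());
// Name the charset explicitly instead of relying on the platform default.
InputStream is = new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
MimeMessage message = new MimeMessage(s, is); // throws MessagingException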

The headers can then be read like this:

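// Raw header lines, for example "Subject: some subject"
Enumeration<String> lines = message.getAllHeaderLines();
// Or each header as a name/value pair
for (Enumeration<Header> e = message.getAllHeaders(); e.hasMoreElements();) {
    Header h = e.nextElement();
    System.out.println(h.getName() + ": " + h.getValue());
}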
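
For the specific fields asked about (date, subject, from, to), the typed getters on MimeMessage are usually more convenient than walking the raw headers. The following is a minimal sketch, assuming the javax.mail (JavaMail 1.x) jar is on the classpath; the class name HeaderFieldsSketch, its helper methods, and the sample message text are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import javax.mail.Address;
import javax.mail.Message;
import javax.mail.MessagingException;
import javax.mail.Session;
import javax.mail.internet.InternetAddress;
import javax.mail.internet.MimeMessage;

class HeaderFieldsSketch {

    /** Parses raw message text into a MimeMessage; no mail server is involved. */
    static MimeMessage parse(String raw) throws MessagingException {
        Session session = Session.getInstance(new Properties());
        return new MimeMessage(session,
                new ByteArrayInputStream(raw.getBytes(StandardCharsets.UTF_8)));
    }

    /** Formats one address as "name <email>", or just the email if there is no name. */
    static String describe(Address address) {
        if (address instanceof InternetAddress) {
            InternetAddress ia = (InternetAddress) address;
            String name = ia.getPersonal(); // null when no display name is present
            return name == null ? ia.getAddress() : name + " <" + ia.getAddress() + ">";
        }
        return address.toString();
    }

    /** Formats every address in the array; the getters return null for absent headers. */
    static List<String> describeAll(Address[] addresses) {
        List<String> result = new ArrayList<>();
        if (addresses != null) {
            for (Address a : addresses) {
                result.add(describe(a));
            }
        }
        return result;
    }

    public static void main(String[] args) throws MessagingException {
        // Made-up message in the same shape as the one in the question.
        String raw = "Date: Sat, 6 Mar 2010 12:32:20 -0800\r\n"
                + "From: Some One <some.one@someotherplace.com>\r\n"
                + "To: someone@someplace.com\r\n"
                + "Subject: some subject\r\n"
                + "\r\n"
                + "some message\r\n";
        MimeMessage message = parse(raw);
        System.out.println("Date:    " + message.getSentDate()); // java.util.Date, or null
        System.out.println("Subject: " + message.getSubject());
        System.out.println("From:    " + describeAll(message.getFrom()));
        System.out.println("To:      "
                + describeAll(message.getRecipients(Message.RecipientType.TO)));
    }
}
```

Non-standard headers such as X-From or X-Folder have no typed getter; they can still be read through the header enumeration shown above.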
Yasin Okumuş
Jherico
  • But how do I parse the "From:" header and separate the names from the email addresses? – Hendy Irawan Mar 25 '12 at 16:04
  • @HendyIrawan There is a getFrom() method on the Message class. It returns an Address[], and the members of that array can typically be cast to InternetAddress if you're dealing with actual email messages. That class has methods for getting the name and email portions of an RFC 822 compliant address. – Jherico Mar 27 '12 at 19:08
  • This works great! However, how can I read an email thread that contains multiple emails? I tried the approach above, but it only reads the first email. Also, how can I extract just the message body? – AbtPst Mar 02 '15 at 18:34
  • You would need to parse each one of the messages and create your own in-memory structure that represents the conversation thread. The thread can generally be reconstructed by looking at the 'In-Reply-To:' header of a given message, which will contain the Message-ID: header of the email to which it was a response. – Jherico Mar 02 '15 at 19:45
  • For the rest, I suggest you try doing some experimentation and if you get stuck you can come back and either find a more relevant question and answer, or ask a question of your own. – Jherico Mar 02 '15 at 19:46
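
The thread reconstruction described in the comments above can be sketched with plain collections. This is only an illustration under stated assumptions: the ThreadSketch and Node names are hypothetical, each Node is assumed to have been filled from an already parsed message, and cyclic references between messages are not handled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ThreadSketch {

    /** Hypothetical holder for the two header values needed to link messages. */
    static final class Node {
        final String messageId;  // value of the Message-ID: header
        final String inReplyTo;  // value of the In-Reply-To: header, or null
        final List<Node> replies = new ArrayList<>();

        Node(String messageId, String inReplyTo) {
            this.messageId = messageId;
            this.inReplyTo = inReplyTo;
        }
    }

    /** Attaches each node to the node it replies to and returns the thread roots. */
    static List<Node> buildThreads(List<Node> nodes) {
        // First pass: index every message by its Message-ID.
        Map<String, Node> byId = new LinkedHashMap<>();
        for (Node n : nodes) {
            byId.put(n.messageId, n);
        }
        // Second pass: link replies to parents; anything without a known parent is a root.
        List<Node> roots = new ArrayList<>();
        for (Node n : nodes) {
            Node parent = n.inReplyTo == null ? null : byId.get(n.inReplyTo);
            if (parent == null || parent == n) {
                roots.add(n);
            } else {
                parent.replies.add(n);
            }
        }
        return roots;
    }
}
```

With JavaMail, the two values would presumably come from MimeMessage.getMessageID() and MimeMessage.getHeader("In-Reply-To", null) on each parsed message.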

I have had problems with JavaMail (it fails to parse some email messages that it should).

I have had much better results with Mime4J.

Adam Gent
  • Have you noticed it failing for MIME messages generated with any specific application or platform? – Brill Pappin Nov 23 '13 at 09:08
  • @AdamGent There are quite a few [system properties](https://docs.oracle.com/javaee/6/api/javax/mail/internet/package-summary.html) that you have to set in order to make JavaMail's MIME parser a bit more 'relaxed'. Otherwise it acts in 'strict mode'. – peterh Apr 10 '18 at 05:07
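
As a rough illustration of the system properties mentioned in the last comment, the sketch below sets a few of them to their lenient values. The property names are taken from the javax.mail.internet package documentation as best I recall it, so check them against the linked page; the class and method names are made up, and which properties actually help depends on how a given message is malformed.

```java
class RelaxedMimeParsing {

    /**
     * Sets a few mail.mime.* system properties to their lenient values.
     * Assumption: call this early, before JavaMail classes are loaded,
     * since some of these values may be read only once.
     */
    static void relax() {
        // Try to decode RFC 2047 encoded text that appears in the middle of a word.
        System.setProperty("mail.mime.decodetext.strict", "false");
        // Allow unquoted whitespace and special characters in header parameter values.
        System.setProperty("mail.mime.parameters.strict", "false");
        // Treat errors in base64 data as end of data instead of failing.
        System.setProperty("mail.mime.base64.ignoreerrors", "true");
        // Parse a multipart message that has no body parts instead of throwing.
        System.setProperty("mail.mime.multipart.allowempty", "true");
    }
}
```

The same settings can be passed on the command line instead, for example -Dmail.mime.decodetext.strict=false.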

I would suggest you use email-mime-parser.

The following sample code gives you all the relevant info you need:

ContentHandler contentHandler = new CustomContentHandler();

MimeConfig mime4jParserConfig = new MimeConfig();
BodyDescriptorBuilder bodyDescriptorBuilder = new DefaultBodyDescriptorBuilder();
MimeStreamParser mime4jParser = new MimeStreamParser(mime4jParserConfig, DecodeMonitor.SILENT, bodyDescriptorBuilder);
mime4jParser.setContentDecoding(true);
mime4jParser.setContentHandler(contentHandler);

InputStream mailIn = ...; // provide the email MIME stream here
mime4jParser.parse(mailIn);

Email email = ((CustomContentHandler) contentHandler).getEmail();

List<Attachment> attachments = email.getAttachments();

Attachment calendar = email.getCalendarBody();
Attachment htmlBody = email.getHTMLEmailBody();
Attachment plainText = email.getPlainTextEmailBody();

String to = email.getToEmailHeaderValue();
String cc = email.getCCEmailHeaderValue();
String from = email.getFromEmailHeaderValue();
Ashish Sharma