28

Currently I'm using the PEAR library's mimeDecode.php for parsing incoming emails. It seems to have a lot of issues and fails to decode a lot of messages, so I'd like to replace it with something better.

I'm looking for something that is able to properly separate parts of the message, such as to, from, body, etc. Ideally it would be able to handle all common encoding methods such as base64, uuencode, quoted printable, etc.

In situations where both plain text and html versions of the same message are contained in a single email, I would ideally like it to know the difference between them so I could choose which part I wished to display.

I'm not worried about attachments at this point in time, but it would be nice for it to have knowledge of them in case I want to implement that in the future.

I saw that PHP has a group of functions starting with the word imap that look like they may do what I want, but I am unsure without trying them out.

Currently I am doing on-the-fly decoding of the messages in PHP, which is why I am looking for a PHP replacement solution.

Does anyone have experience with this who could point me in the right direction? I'd hate to start using something that would end up not doing what I need in the long run.

Sgraffite

6 Answers

13

I recently developed a PHP mail parser and have been using it in production.
I am very happy with it, and some developers have forked it:

https://github.com/plancake/official-library-php-email-parser

Dan
  • 1
    It doesn't handle attachments well - it has the base64 encoded attachments stuff inside the HTML body. And has no `getAttachment()` kind of functions at all. – Slawa Oct 04 '12 at 13:55
  • 1
    Thanks for the bug report, Slawa - I will look into it. If you need to extract the attachment, I suggest you try http://code.google.com/p/php-mime-mail-parser/ – Dan Oct 04 '12 at 17:17
  • absolutely awesome library - perfect for what i needed – ChicagoSky Nov 29 '14 at 19:26
  • 1
    It is awesome but it turns out that it can't handle more complex mail structure. I've found a situation where an email has one boundary value to separate an attachment from the text/html body and then a different boundary value to split off text and html body parts... That's just not handled. – phoenix Aug 15 '17 at 21:57
8

I know this question's four years old now... but I ended up in need of a mail parsing library and wasn't satisfied with any of the available options. I wanted something reliable, PSR-2 compliant, installable via composer.

composer require zbateson/mail-mime-parser

It's its own parser, built from the ground up to get around known issues and bugs in other implementations. It is extensively tested and quite widely used.

The library makes use of PSR-7 streams, which allow you to pass it any kind of stream you like. It also doesn't store all information in memory -- very large attachments can be returned as a stream instead of a string if so desired, so memory isn't used up. Similarly, the entire message is never stored directly in memory; only the headers and references to streams are kept in memory.

https://github.com/zbateson/mail-mime-parser

Check out the website for a guide and the API... and if you find bugs/typos or see improvements, please feel free to open an issue, or dig right in and contribute with a pull request :)

zbateson
6

Funny you should ask... I'm actually working on a simple notification system now. I just finished up the Bounce Manager, which I used Zend_Mail to implement. It has pretty much all the features you're looking for... you can connect to a mailbox (POP3, IMAP, Mbox, and Maildir) and pull messages from it as well as operate on all those messages.

It handles multipart messages, but the parts can be hard to work with. I had a hard time figuring out which part was the attached original message in the NDRs I was working with, but I have a feeling I just missed something in the documentation. I'm not sure how it handles encoding, because my usage was fairly simple, but I'm pretty sure it has provisions for all the encodings you mentioned. Check out the docs and browse the API.

prodigitalson
  • Do you know if it is possible to use Zend_Mail without the storage connector? I'd like to pass it an incoming message as a string and be able to use the methods associated to messages on it without it needing to have come from a storage location. – Sgraffite Jan 18 '11 at 17:53
  • 2
    Yes, I'm sure there is a way, because this same class is used to send messages with the mailer/transport classes as well, and in that case you would always be constructing a message from strings/files. If I recall, it looks something like `$m = new Zend_Mail_Message(array('raw' => $stringMessage));` Take a look at the actual class and the doc comments for the constructor to verify. – prodigitalson Jan 18 '11 at 20:14
  • 1
    This ended up working out for me. Zend did a few things I didn't understand, however. Zend will throw an exception when it does not recognize a header. In my case, I don't care about unrecognized headers, so I ended up commenting out that exception. Also, there is a function where Zend does a foreach() on $parts, but sometimes the variable it is trying to foreach on is null, so I added a null check there and return $res if it is null. – Sgraffite Jan 26 '11 at 01:17
  • 1
    Finally when it is checking mime boundaries, it throws an exception if it can't find the closing boundary. In my case it was a malformed message, but the body was still readable, so I ended up commenting out that exception also. I'd rather give the user a malformed body than nothing. – Sgraffite Jan 26 '11 at 01:17
  • Hmm, I didn't run into any problems with headers, and I was actually using custom headers for things (like X-CUSTOMNS-CUSTOMNAME). It will, however, throw an exception if you try to read a header that doesn't exist... you must use `$msg->hasHeader($header)`. Personally I would rather it return null, false or -1 instead of having to explicitly test... – prodigitalson Jan 26 '11 at 01:50
  • I was only parsing incoming messages, maybe that is the difference? It was looping through all the headers and checking them with a case statement, if it hit default: it would throw an exception. – Sgraffite Jan 26 '11 at 02:06
  • Hmm... odd... I was doing incoming messages as well (for outgoing I use Swift Mailer) and never had an issue with any custom headers... Of course, the custom headers were in an attached message (i.e. a mail part)... so I wasn't reading the custom headers in the top-level message. – prodigitalson Jan 26 '11 at 02:09
  • It was literally 3 messages out of 11.2k total messages that ended up throwing that exception, so probably not very common. – Sgraffite Jan 26 '11 at 02:14
  • If anyone comes around here, the proper way to get a header is to check if it exists with the `headerExists()` method and if yes, fetch it with `getHeader()` – Gabriel S. Aug 10 '12 at 08:34
  • I don't think you can still parse RAW email in 2014 with ZF2 – QuantumHive May 30 '14 at 14:50
  • The link is broken; here is the working link: https://docs.zendframework.com/zend-mail/ – Nicolas Mar 16 '18 at 15:42
4

I forked php-mime-mail-parser to correct all the issues: Fork of php-mime-mail-parser

More than 52 tests and 764 assertions. Code coverage: 100% of lines, 100% of functions and methods, 100% of classes and traits.

You need the PECL Package MailParse to use it but the wrapper is without issue and fully tested.

eXorus
2

For completeness here's the one I'm going to try. http://code.google.com/p/php-mime-mail-parser/ - it's a wrapper around PHP MailParse, which needs to be installed.

Slawa
1

I'm also on the lookout for an easy-to-use, robust MIME email parsing library and am seriously looking into the Mail component from eZ Components. But if you're looking for something that will make it as easy as `echo $email->text;` or `echo $email->html;`, like I was, you'll be disappointed. Actually, I now don't think such a simplification is even possible, due to the way MIME works. But it does seem like the best option out there in the PHP world.
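To illustrate why picking "the text" or "the HTML" is not a simple property lookup, here is a minimal sketch in plain PHP. It assumes a message where an outer boundary separates an attachment from a nested `multipart/alternative` section that holds the plain and HTML bodies. `splitEntity()` and `collectParts()` are hypothetical helpers written for this example, not functions from any of the libraries discussed here, and the sketch leaves out transfer-encoding and charset handling, which a real parser must cover.

```php
<?php
// Hypothetical helpers for illustration only; not a complete RFC 2045/2046 parser.

// Split one MIME entity into [headers keyed by lowercase name, body].
function splitEntity(string $raw): array
{
    $raw = str_replace("\r\n", "\n", $raw);
    if ($raw !== '' && $raw[0] === "\n") {
        return [[], substr($raw, 1)];               // entity without any headers
    }
    $pos  = strpos($raw, "\n\n");
    $head = $pos === false ? $raw : substr($raw, 0, $pos);
    $body = $pos === false ? '' : substr($raw, $pos + 2);
    $head = preg_replace('/\n[ \t]+/', ' ', $head); // unfold continuation lines
    $headers = [];
    foreach (explode("\n", $head) as $line) {
        if (strpos($line, ':') !== false) {
            [$name, $value] = explode(':', $line, 2);
            $headers[strtolower(trim($name))] = trim($value);
        }
    }
    return [$headers, $body];
}

// Walk the part tree depth-first and collect every non-multipart leaf.
function collectParts(string $raw, array &$out = []): array
{
    [$headers, $body] = splitEntity($raw);
    $ctype = $headers['content-type'] ?? 'text/plain';
    $type  = strtolower(trim(explode(';', $ctype)[0]));
    if (strpos($type, 'multipart/') === 0
        && preg_match('/boundary\s*=\s*"?([^";]+)"?/i', $ctype, $m)) {
        $boundary = '--' . trim($m[1]);
        $current  = null;                           // lines of the part being read
        foreach (explode("\n", $body) as $line) {
            $trimmed = rtrim($line);
            $isOpen  = $trimmed === $boundary;
            $isClose = $trimmed === $boundary . '--';
            if ($isOpen || $isClose) {
                if ($current !== null) {
                    collectParts(implode("\n", $current), $out);
                }
                $current = $isOpen ? [] : null;
                if ($isClose) {
                    break;
                }
                continue;
            }
            if ($current !== null) {
                $current[] = $line;
            }
        }
        if ($current) {                             // closing boundary was missing
            collectParts(implode("\n", $current), $out);
        }
    } else {
        $out[] = ['type' => $type, 'body' => rtrim($body, "\n")];
    }
    return $out;
}

// Sample message: multipart/mixed wrapping a multipart/alternative and an attachment.
$raw = implode("\r\n", [
    'From: alice@example.com',
    'Subject: Nested example',
    'Content-Type: multipart/mixed; boundary="outer"',
    '',
    '--outer',
    'Content-Type: multipart/alternative; boundary="inner"',
    '',
    '--inner',
    'Content-Type: text/plain; charset=utf-8',
    '',
    'Hello in plain text',
    '--inner',
    'Content-Type: text/html; charset=utf-8',
    '',
    '<p>Hello in HTML</p>',
    '--inner--',
    '--outer',
    'Content-Type: application/octet-stream; name="a.bin"',
    'Content-Transfer-Encoding: base64',
    '',
    'AAEC',
    '--outer--',
    '',
]);

$parts  = collectParts($raw);
$byType = array_column($parts, 'body', 'type');     // later duplicates overwrite earlier ones
$html   = $byType['text/html'] ?? null;
```

The caller still has to decide which leaf to display and what to do when a type is missing or appears more than once, which is the part a single `$email->text` property would hide.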

I started working on my current project with the Zend_Mail component, but when the time came to actually dig inside those email parts and encoded headers, Zend_Mail pretty much leaves you out in the cold. You need to do most of the decoding yourself, which is not fun at all.

As for the IMAP PHP extension, it's meant to deal with retrieving messages from your mailbox, not MIME decoding them. It does, however, have some handy decoding functions that you might need. The Mailparse PECL extension, on the other hand, deals with exactly that problem set. I haven't tried it yet, but it seems like you need to write a lot of code to actually get to the data you want.
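The IMAP extension's decoding helpers include `imap_base64()`, `imap_qprint()` and `imap_mime_header_decode()`. As a rough sketch of the "decode it yourself" work, the example below uses core PHP equivalents instead, so it runs without the IMAP extension (the header helper assumes the iconv extension is available). `decodeBody()` and `decodeHeader()` are hypothetical names chosen for this example.

```php
<?php
// decodeBody() is a hypothetical helper for illustration, not a library function.
// It reverses the Content-Transfer-Encoding of a single MIME part body.
function decodeBody(string $body, string $encoding): string
{
    switch (strtolower(trim($encoding))) {
        case 'base64':
            // Non-strict mode skips the line breaks that wrap base64 in mail.
            return (string) base64_decode($body);
        case 'quoted-printable':
            return quoted_printable_decode($body);
        case 'x-uuencode':
        case 'uuencode':
            // Expects only the encoded lines, without the "begin"/"end" wrapper.
            return (string) convert_uudecode($body);
        default:
            // 7bit, 8bit and binary need no decoding.
            return $body;
    }
}

// Decode RFC 2047 encoded words in a header value (requires the iconv extension).
function decodeHeader(string $value): string
{
    $decoded = iconv_mime_decode($value, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, 'UTF-8');
    return $decoded === false ? $value : $decoded;
}

$plain   = decodeBody("SGVsbG8sIHdv\r\ncmxkIQ==", 'base64');
$quoted  = decodeBody("caf=C3=A9 au =\r\nlait", 'quoted-printable');
$subject = decodeHeader('=?UTF-8?B?Y2Fmw6k=?=');
```

Converting the decoded body from the part's declared charset to the charset you display in is a separate step that this sketch does not cover.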

nnc
  • That looks decent by looking at the docs. I already put the hours in for implementing and testing the Zend_Mail library, and it appears to work pretty well. I honestly can't spend more time at work looking into a new library at this point. Thanks for the response though :) – Sgraffite Jan 29 '11 at 06:20