I want to convert an HTML file with a table-based layout to plain text so that I can send a multipart email via PHP.
I have tried a few prebuilt classes and functions that I found on SO, but none of them produce decent results, which I believe is down to the table-based layout.
I don't want to roll my own class for stripping HTML and formatting the results, as I am sure there are edge cases that I won't account for or be able to test until I come across them in production.
The best solution I've come up with so far is:
- Create a temporary HTML file
- Use something like shell_exec("/path/to/lynx -dump temporary.html"); to create a plaintext version of the email
- Use some regex to get rid of any remaining unwanted tags
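For reference, the steps above look roughly like the sketch below. The function names htmlToPlainText and cleanPlainText are placeholders of mine, /path/to/lynx stands in for wherever lynx is actually installed, and it assumes shell_exec is enabled on the server. The regex cleanup is deliberately naive, so it would also eat legitimate text in angle brackets, such as an email address written as <user@example.com>.

```php
<?php
// Sketch only: assumes lynx is installed and shell_exec() is not disabled.

// Step 3: remove leftover tags and tidy whitespace (naive regex cleanup).
function cleanPlainText(string $text): string
{
    // Drop anything that still looks like an HTML tag.
    $text = preg_replace('/<[^>]+>/', '', $text) ?? $text;
    // Trim trailing spaces and tabs on every line.
    $text = preg_replace('/[ \t]+$/m', '', $text) ?? $text;
    // Collapse runs of three or more newlines into one blank line.
    $text = preg_replace('/\n{3,}/', "\n\n", $text) ?? $text;
    return trim($text);
}

// Steps 1 and 2: write a temporary .html file and dump it with lynx.
// Returns null if the file cannot be written or lynx produces no output.
function htmlToPlainText(string $html, string $lynxPath = '/path/to/lynx'): ?string
{
    // The .html extension lets lynx treat the file as HTML.
    $htmlFile = sys_get_temp_dir() . '/' . uniqid('mail_', true) . '.html';
    if (file_put_contents($htmlFile, $html) === false) {
        return null;
    }

    try {
        // Escape both arguments so the path cannot break the shell command.
        $command = escapeshellarg($lynxPath) . ' -dump ' . escapeshellarg($htmlFile);
        $raw = shell_exec($command);
    } finally {
        // Always remove the temporary file, even if the call fails.
        unlink($htmlFile);
    }

    return is_string($raw) ? cleanPlainText($raw) : null;
}
```

I call it as $plain = htmlToPlainText($htmlBody); and use the result as the text/plain part whenever it is not null.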
This works fine, but I'm a little worried that it's not the optimal way of producing a decent multipart email. Is anyone aware of a better way?
To clarify, I have already tried the following without success:
- html2text class - http://www.chuggnutt.com/html2text.php
- Markdownify - http://milianw.de/projects/markdownify/
- html2text version 2 - http://www.howtocreate.co.uk/php/html2texthowto.html
- http://journals.jevon.org/users/jevon-phd/entry/19818