10

I fetch emails over IMAP; the messages come from Gmail and Outlook.

Gmail encodes subjects like this: =?UTF-8?B?UmU6IM69zq3OvyDOtc68zrHOuc67IG5ldyBlbWFpbA==?= and Outlook encodes them like this: =?iso-8859-7?B?UmU6IOXr6+ft6er8IHN1YmplY3Q=?=

Unfortunately, I have not yet found a solution that turns this into readable text. So far I have been experimenting with:

mb_convert_encoding($body, "UTF-8", "UTF-8"); 

and

mb_convert_encoding($body, "UTF-8", "iso-8859-7");

but I am struggling to find a solution to solve this matter.

This is how I open the IMAP connection to my account (which holds many Gmail and Outlook messages):

$hostname = '{imappro.zoho.com:993/imap/ssl}INBOX';
$username = 'email@email.com';
$password = 'password';


/* try to connect */
$inbox = imap_open($hostname,$username ,$password) or die('Cannot connect to Zoho: ' . imap_last_error());

/* grab emails */
$emails = imap_search($inbox,'UNSEEN');

Any help?

EnexoOnoma

5 Answers

5

Unfortunately I did not find yet any solution that will help me make this into readable text.

Solution: your strings are Base64-encoded.

=?UTF-8?B?UmU6IM69zq3OvyDOtc68zrHOuc67IG5ldyBlbWFpbA==?=

echo base64_decode('UmU6IM69zq3OvyDOtc68zrHOuc67IG5ldyBlbWFpbA==');

prints "Re: νέο εμαιλ new email"

=?iso-8859-7?B?UmU6IOXr6+ft6er8IHN1YmplY3Q=?=

echo base64_decode('UmU6IOXr6+ft6er8IHN1YmplY3Q=');

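prints "Re: ", then eight raw ISO-8859-7 bytes, then " subject". The Greek word in the middle is not valid UTF-8, so it shows up as garbage (or not at all) until the charset is converted.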

The answer is to use base64_decode in conjunction with your current solutions.

You can recognise Base64 text by its alphabet: the letters a-z and A-Z, the digits 0-9 and two other characters (usually + and /), and it is usually right-padded with =. In these headers the ?B? marker after the charset name also signals Base64.

EDIT:

Sorry, I had forgotten that the question was about converting from iso-8859-7 to UTF-8 so that the text displays correctly.

<?php
$str = base64_decode('UmU6IPP03evt+SDs3u317OE=');
$str = mb_convert_encoding($str,'UTF-8','iso-8859-7');
echo $str;
?>

The result is "Re: στέλνω μήνυμα"
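
The two steps above can be folded into one helper that reads the charset and the encoding flag from the encoded word itself instead of hard-coding them. This is only a sketch under a few assumptions: decode_encoded_word() is a hypothetical name, the mbstring extension is assumed to be loaded, and it handles a single =?charset?B?...?= or =?charset?Q?...?= token, not a whole header line with several tokens.

```php
<?php
// Hypothetical helper (not part of PHP): decodes ONE MIME encoded-word
// of the form =?charset?B?data?= or =?charset?Q?data?= to UTF-8.
// Assumes the mbstring extension is available.
function decode_encoded_word(string $word): string
{
    // Capture the charset, the encoding flag (B or Q) and the payload.
    if (!preg_match('/^=\?([^?]+)\?([BbQq])\?([^?]*)\?=$/', $word, $m)) {
        return $word; // not an encoded word, return it untouched
    }
    [, $charset, $flag, $payload] = $m;

    if (strtoupper($flag) === 'B') {
        // B means Base64.
        $bytes = base64_decode($payload);
    } else {
        // Q is a quoted-printable variant where "_" stands for a space.
        $bytes = quoted_printable_decode(str_replace('_', ' ', $payload));
    }

    // Convert from the declared charset to UTF-8.
    return mb_convert_encoding($bytes, 'UTF-8', $charset);
}

echo decode_encoded_word('=?UTF-8?B?UmU6IM69zq3OvyDOtc68zrHOuc67IG5ldyBlbWFpbA==?='), "\n";
echo decode_encoded_word('=?iso-8859-7?B?UmU6IPP03evt+SDs3u317OE=?='), "\n";
```

Strings that are not encoded words are returned unchanged, so the helper can be applied to every subject without checking first.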

chicks
Altimus Prime
  • But how about this? `echo base64_decode('ZP3OUC66Z4ZOU86XZ4IGZP3OUC66ZR/OU86XZPDOTM63Z4I=');` It returns `d��P.�g�NSΗg�d��P.�e�SΗd��Lηg�` – EnexoOnoma Aug 09 '17 at 17:30
  • Not all data is text, and not all text data is single byte per character. Is that the complete string in one of the examples you've come across, or just an excerpt? Also, what was the character encoding that preceded it in your response? – Altimus Prime Aug 09 '17 at 17:35
  • This a similar example: `=?iso-8859-7?B?UmU6IPP03evt+SDs3u317OE=?=` – EnexoOnoma Aug 09 '17 at 19:01
  • If you are looking at it in your browser you'll need to set the charset header like `header('Content-Type: text/html; charset=iso-8859-7');` – Altimus Prime Aug 09 '17 at 19:26
  • I had made a mistaken note about it being Arabic, but really it's Greek, I think. – Altimus Prime Aug 09 '17 at 19:26
  • Somehow I thought that since you already knew to use mb_convert_encoding that base64 was all you still needed to know. – Altimus Prime Aug 10 '17 at 01:06
2

Look at this example:

   /* connect to gmail */
    $hostname = '{imap.gmail.com:993/imap/ssl}INBOX';
    $username = 'davidwalshblog@gmail.com';
    $password = 'davidwalsh';

    /* try to connect */
    $inbox = imap_open($hostname,$username,$password) or die('Cannot connect to Gmail: ' . imap_last_error());

    /* grab emails */
    $emails = imap_search($inbox,'ALL');

    /* if emails are returned, cycle through each... */
    if($emails) {

        /* begin output var */
        $output = '';

        /* put the newest emails on top */
        rsort($emails);

        /* for every email... */
        foreach($emails as $email_number) {

            /* get information specific to this email */
            $overview = imap_fetch_overview($inbox,$email_number,0);
            $message = imap_fetchbody($inbox,$email_number,2);

            /* output the email header information */
            $output.= '<div class="toggler '.($overview[0]->seen ? 'read' : 'unread').'">';
            $output.= '<span class="subject">'.$overview[0]->subject.'</span> ';
            $output.= '<span class="from">'.$overview[0]->from.'</span>';
            $output.= '<span class="date">on '.$overview[0]->date.'</span>';
            $output.= '</div>';

            /* output the email body */
            $output.= '<div class="body">'.$message.'</div>';
        }

        echo $output;
    } 

    /* close the connection */
    imap_close($inbox);

For reading and decoding, look at this version:

<?php
$hostname = '{********:993/imap/ssl}INBOX';
$username = '*********';
$password = '******';

$inbox = imap_open($hostname,$username,$password) or die('Cannot connect to server: ' . imap_last_error());

$emails = imap_search($inbox,'ALL');

if($emails) {
    $output = '';
    rsort($emails);

    foreach($emails as $email_number) {
        $overview = imap_fetch_overview($inbox,$email_number,0);
        $structure = imap_fetchstructure($inbox, $email_number);

        /* defaults, so a message without a second part does not reuse stale values */
        $message = '';
        $encoding = 'n/a';

        if(isset($structure->parts) && is_array($structure->parts) && isset($structure->parts[1])) {
            $part = $structure->parts[1];
            $encoding = $part->encoding;
            $message = imap_fetchbody($inbox,$email_number,2);

            /* undo the transfer encoding: 3 = BASE64, 4 = QUOTED-PRINTABLE */
            if($encoding == 3) {
                $message = imap_base64($message);
            } else if($encoding == 4) {
                $message = imap_qprint($message);
            }
            /* 0 (7BIT), 1 (8BIT) and 2 (BINARY) need no decoding */
        }

        /* imap_utf8() decodes MIME encoded-words to UTF-8; wrapping it in
           utf8_decode() would convert to ISO-8859-1 and lose Greek text */
        $output.= '<div class="toggler '.($overview[0]->seen ? 'read' : 'unread').'">';
        $output.= '<span class="from">From: '.imap_utf8($overview[0]->from).'</span>';
        $output.= '<span class="date">on '.$overview[0]->date.'</span>';
        $output.= '<br /><span class="subject">Subject('.$encoding.'): '.imap_utf8($overview[0]->subject).'</span> ';
        $output.= '</div>';

        $output.= '<div class="body">'.$message.'</div><hr />';
    }

    echo $output;
}

imap_close($inbox);
?>

Look here for a great tutorial on email structure and a function to extract it.

BlooB
0

If you want to decode header elements, there is a PHP function for that: imap_mime_header_decode().

Also, you will need some MIME parser class to decode multipart messages.
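
For the header part, a minimal sketch of how imap_mime_header_decode() might be used is shown here. It returns an array of objects, each with a charset and a text property, and the text is still in that charset, so a conversion step is needed. The wrapper name header_to_utf8() is hypothetical, and the sketch assumes both the imap and mbstring extensions are loaded.

```php
<?php
// Hypothetical wrapper: decodes a raw header value to UTF-8.
// Assumes the imap and mbstring extensions are loaded.
function header_to_utf8(string $raw): string
{
    $result = '';
    // Each element has ->charset and ->text; unencoded runs report "default".
    foreach (imap_mime_header_decode($raw) as $piece) {
        if (strtolower($piece->charset) === 'default') {
            $result .= $piece->text; // unencoded run, keep as-is
        } else {
            $result .= mb_convert_encoding($piece->text, 'UTF-8', $piece->charset);
        }
    }
    return $result;
}

// Only run the demo when the imap extension is actually present.
if (function_exists('imap_mime_header_decode')) {
    echo header_to_utf8('=?iso-8859-7?B?UmU6IPP03evt+SDs3u317OE=?='), "\n";
}
```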

0

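To get the headers, pass your stream ($inbox) to imap_headers(), which returns one summary line per message. For the individual header fields of a single message, use imap_headerinfo() with the stream and a message number; its documentation has the full list of values you can get.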

For the actual messages, plain text can be read using imap_body(), passing the stream and the number of the message you want (in $emails after your search). Getting an html/multipart email is a bit trickier. First you need imap_fetchstructure(), which identifies the parts of the message, then imap_fetchbody() to get the piece you are interested in.

Once you have a result from imap_fetchbody(), if you still need to adjust the encoding, it could be done at this point.
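
That last adjustment could look roughly like the sketch below. Both function names are hypothetical. The sketch assumes the part object comes from imap_fetchstructure() (with its encoding, ifparameters and parameters properties), that the raw string comes from imap_fetchbody(), and that mbstring is loaded. The demo builds a stand-in part object by hand so it runs without a mailbox.

```php
<?php
// Hypothetical helper: turns the raw string from imap_fetchbody() into UTF-8.
// $encoding is the part's ->encoding value (3 = BASE64, 4 = QUOTED-PRINTABLE);
// $charset is the charset declared for that part.
function body_to_utf8(string $raw, int $encoding, string $charset = 'UTF-8'): string
{
    if ($encoding === 3) {
        $raw = base64_decode($raw);
    } elseif ($encoding === 4) {
        $raw = quoted_printable_decode($raw);
    }
    // Other transfer encodings (7BIT, 8BIT, BINARY) need no decoding.
    return mb_convert_encoding($raw, 'UTF-8', $charset);
}

// Hypothetical helper: finds the "charset" parameter of a part object,
// falling back to a default when the part declares none.
function part_charset(object $part, string $fallback = 'UTF-8'): string
{
    if (!empty($part->ifparameters)) {
        foreach ($part->parameters as $param) {
            if (strtolower($param->attribute) === 'charset') {
                return $param->value;
            }
        }
    }
    return $fallback;
}

// Stand-in for a part returned by imap_fetchstructure(), for illustration only.
$part = (object) [
    'encoding'     => 3,
    'ifparameters' => 1,
    'parameters'   => [(object) ['attribute' => 'CHARSET', 'value' => 'iso-8859-7']],
];
echo body_to_utf8('8/Td6+35', $part->encoding, part_charset($part)), "\n";
```

Keeping the transfer decoding and the charset conversion in one place means the rest of the script only ever deals with UTF-8 strings.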

Zayn Ali
RelicScoth
0

I had a task to receive emails from a certain mailbox, parse them and index certain content.

I wanted a small microservice that would provide me with the data. It had to:

  1. Download the required content
  2. Convert the received data into a readable format
  3. Process the content

So I decided to use ready-made tools.

  1. A script for fetching emails: imap2maildir
  2. A Unix client for processing messages: mu
  3. The dos2unix converter

Next, I wrote a small bash script that I placed in cron:

#!/bin/bash
python /var/mail_dump/imap2maildir/imap2maildir -c /var/mail_dump/imap2maildir/deploy.conf
mu index --maildir=/var/mail_dump/dumps/new
#clean old data
rm -rf /var/mail_dump/extract/*

#search match messages
mu find jivo --fields="l" --nocolor | xargs $1 cp -t /var/mail_dump/extract
#converting
dos2unix -f /var/mail_dump/extract/*

#reassembly of messages in html
cd /var/mail_dump/extract/
for i in /var/mail_dump/extract/*
do
  mu extract --parts=0 --overwrite "$i"
  rm "$i"
done

Done! I now have a service that constantly receives emails and prepares them for processing. PHP works with the prepared data and does not need to deal with the low-level logic.

Redr01d