I am writing a simple IMAP client with Perl's Net::IMAP::Simple module. I'd like to print the subjects of messages on standard output. The subject is encoded as quoted-printable, so I decode it with the decode_qp() function from MIME::QuotedPrint. Everything prints fine except whitespace, which remains encoded, and I have no idea why. The output currently looks like this:

[073] =?UTF-8?Q?[Myawesome_subject_topic]?= =?UTF-8?Q?_Сообщение?= =?UTF-8?Q?_номер?=

As you can see, the whitespace is located between the ?= and =?UTF-8?Q?_ 'tags'. I'm not sure how to deal with it. The relevant part of the code is below:

    my $nm = $imap->select('INBOX');
    for (my $i = 1; $i <= $nm; $i++) {
        if ($imap->seen($i)) {
            print '*';
        }
        else {
            print " ";
        }
        my $es = Email::Simple->new(join '', @{ $imap->top($i) });
        my $decoded = $es->header('Subject');
        $decoded = decode_qp($decoded);
        printf("[%03d] %s\n", $i, $decoded);
    }

UPDATE AND SOLUTION

  1. Use Encode module instead of MIME::QuotedPrint

    use Encode qw(decode);

  2. Decode subject like this

    $decoded = decode("MIME-Header", $encoded);
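As a minimal, self-contained sketch of the two steps above: the sample header value below is made up for illustration (it is not the subject from the question), and the snippet assumes the core Encode module with its MIME-Header encoding is available.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Print decoded characters as UTF-8 instead of raw wide characters.
binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical raw Subject value: two Q-encoded words separated by a space.
my $encoded = '=?UTF-8?Q?caf=C3=A9?= =?UTF-8?Q?_menu?=';

# "MIME-Header" decodes every encoded word and joins adjacent ones.
my $decoded = decode('MIME-Header', $encoded);

print "$decoded\n";
```

In the loop from the question, the same call would replace the decode_qp() line, taking the value returned by $es->header('Subject') as input.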

Additional info on the topic is in the accepted answer below.

Ivan

1 Answer

You cannot simply decode the full subject value as quoted-printable, since not all of the subject is encoded. If you have something like

 Subject: =?UTF-8?Q?AAAAAAAA?=   =?UTF-8?Q?BBBBBBBB?=

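you have to take each =?CHENC?Q?ENCODED?= separately, decode the ENCODED part as quoted-printable, and then interpret the result according to the character encoding CHENC (i.e. UTF-8 in your specific case). Note that in this Q encoding an underscore stands for a space, which decode_qp() does not handle, and that whitespace between two adjacent encoded words is dropped. Once this is done, replace the whole =?...?= part with the decoded data.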
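The steps above can be sketched roughly as follows. This is an illustration only, not a full RFC 2047 implementation: the helper name decode_q_words is hypothetical, it handles only the Q encoding (not B), and the sample header value is made up.

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(decode_qp);
use Encode qw(decode);

# Hypothetical helper: decode the Q-encoded words in a header value.
sub decode_q_words {
    my ($header) = @_;

    # Whitespace between two adjacent encoded words is not part of the text.
    $header =~ s/(\?=)\s+(?==\?)/$1/g;

    # Replace each =?CHENC?Q?ENCODED?= with its decoded characters.
    $header =~ s{=\?([^?\s]+)\?[Qq]\?([^?\s]*)\?=}{
        my ($charset, $text) = ($1, $2);
        $text =~ s/_/=20/g;                  # "_" stands for a space in Q encoding
        decode($charset, decode_qp($text));  # QP bytes -> characters in CHENC
    }ge;

    return $header;
}

binmode STDOUT, ':encoding(UTF-8)';
print decode_q_words('=?UTF-8?Q?caf=C3=A9?=   =?UTF-8?Q?_menu?='), "\n";
```

For real mail, an existing implementation such as Encode::MIME::Header is the safer choice, since it also covers the B encoding and the edge cases this sketch ignores.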

For the exact details see RFC 2047. For an existing implementation in Perl see for example Encode::MIME::Header. See also Decode an UTF8 email header.

Steffen Ullrich
  • You live, you learn. The solution via the link you provided works perfectly. Thanks. I've got to read the RFCs for the email specifications – Ivan Feb 01 '18 at 17:40