I am working on a piece of code to read a value from a Gmail message. The email itself is HTML, so the code returns a list containing the raw HTML, and I am unable to parse the data out of it.

My Code:

import imaplib

ORG_EMAIL = "comapnyname.com"
FROM_EMAIL = "automation@companyname.co"
FROM_PWD = "password123!"
SMTP_SERVER = "imap.gmail.com"


def read_email_from_gmail():

    mail = imaplib.IMAP4_SSL(SMTP_SERVER)
    mail.login(FROM_EMAIL, FROM_PWD)
    mail.select("inbox")

    email_type, data = mail.search(None, "ALL")
    mail_ids = data[0]
    id_list = mail_ids.split()

    latest_email_id = int(id_list[-1])

    email_type, data = mail.fetch(str.encode(str(latest_email_id)), "(RFC822)")

    string_data = str(data)
    print('MAIL Data: ')
    print(string_data)



read_email_from_gmail()

This code returns a long list that contains the headers and the HTML of the message:

[(b'1 (RFC822 {54624}', b'Delivered-To: automation+qa1@spekit.co\r\nReceived: by 2002:a4a:6f04:0:0:0:0:0 with SMTP id h4csp1519301ooc;\r\n        Thu, 10 Sep 2020 09:18:42 -0700 (PDT)\r\nX-Google-Smtp-Source: ABdhPJy/7yOn17HKdn+QjP0XHEOK2fu8LDL8tz4jDmDKemms2GVyykqDCDUfppmRbV4DUi7ckRRg\r\nX-Received: by 2002:a25:d7cd:: with SMTP id o196mr14075369ybg.91.1599754722247;\r\n        Thu, 10 Sep 2020 09:18:42 -0700 (PDT)\r\nARC-Seal: i=1; a=rsa-sha256; t=1599754722; cv=none;\r\n        d=google.com; s=arc-20160816;\r\n        b=KzNg7bsmLaNcrRMihkN+AwlTp8ybj5D65K+Z21Ddl/lgd2LN90InAWhj+guhrmzHtB\r\n         vw83T4AlJ8u2jpAs5qYUbxgd/R5COLhlRDqR/dE4wljRgIq2W6sVCJo/fGuZruFjob4Z\r\n         h1acPat0xa3h83lJzzbH576KggTqdScMwCbLsujPr/FclnHNjkqxQuFQlV23nAGgvWX8\r\n         raiIW+6wC070tmQaaz3feIVfo7r7cmQBGokOmy8B3of0/kqIyMVuaEkmk2kno8VFvILF\r\n         i8YPq7bOHVNpre7KwiG4r69PdaDRXIcd/ETtuyusfNXOrGJ0QhC44j2eLUpxlRltOGgL\r\n         NAeA==\r\nARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816;\r\n        h=mime-version:date:message-id:to:subject:from:dkim-signature\r\n         :dkim-signature;\r\n        bh=ZNxh0gTg5kVpAZyTHGJ2jWADa5UGAoPCP3GFX1DUu94=;\r\n        b=WjnIWwVX2oWrl3aZoKlzck1GAoy/gT5/cbNP+tnmdypfjvAUTyuZ3OO5xXlZB/CiF9\r\n         PkYZFEzJQSxradr3ky5T7tLmV2qKnHfaIp3G3STUs5f9vhSfp6qknV7ouLBGwCWyp2gp\r\n         e14Aek7M5ciVC1GIjxlr7AXZne4eHSwCb7u8j91Yt8B2getEQ9lyQlChwjYf38Kau5lL\r\n         wPmMtAM0DDOqlNff2gTBEFgAX1s0Wk+g8mKS31tzBMIQvayR+a3PHX+S3zhtC2i1XsLm\r\n         NOWSMsI0ZEEk/mjA36DVWhEN0d9llOwiDfFonXxIkcPZLlNR3zGfA61apTeud7i24vYn\r\n         bfCw==\r\nARC-Authentication-Results: i=1; mx.google.com;\r\n       dkim=pass header.i=@spekit.co header.s=mandrill header.b=RhjFdk+T;\r\n       dkim=pass header.i=@mandrillapp.com header.s=mandrill header.b=SusUoY2S;\r\n       spf=pass (google.com: domain of bounce-md_31064008.5f5a51e1.v1-8084cafe0c6c4aeca73fef8bdaf5b70b@mandrillapp.com designates 198.2.180.17 as permitted sender) 
smtp.mailfrom=bounce-md_31064008.5f5a51e1.v1-8084cafe0c6c4aeca73fef8bdaf5b70b@mandrillapp.com;\r\n       dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=spekit.co\r\nReturn-Path: <bounce-md_31064008.5f5a51e1.v1-8084cafe0c6c4aeca73fef8bdaf5b70b@mandrillapp.com>\r\nReceived: from mail180-17.suw31.mandrillapp.com (mail180-17.suw31.mandrillapp.com. [198.2.180.17])\r\n        by mx.google.com with ESMTPS id t10si6240908ybl.463.2020.09.10.09.18.42\r\n        for <automation+qa1@spekit.co>\r\n        (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);\r\n        Thu, 10 Sep 2020 09:18:42 -0700 (PDT)\r\nReceived-SPF: pass (google.com: domain of bounce-md_31064008.5f5a51e1.v1-8084cafe0c6c4aeca73fef8bdaf5b70b@mandrillapp.com designates 198.2.180.17 as permitted sender) client-ip=198.2.180.17;\r\nAuthentication-Results: mx.google.com;\r\n       dkim=pass header.i=@spekit.co header.s=mandrill header.b=RhjFdk+T;\r\n       dkim=pass header.i=@mandrillapp.com header.s=mandrill header.b=SusUoY2S;\r\n       spf=pass (google.com: domain of bounce-md_31064008.5f5a51e1.v1-8084cafe0c6c4aeca73fef8bdaf5b70b@mandrillapp.com designates 198.2.180.17 as permitted sender) smtp.mailfrom=bounce-md_31064008.5f5a51e1.v1-8084cafe0c6c4aeca73fef8bdaf5b70b@mandrillapp.com;\r\n       dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=spekit.co\r\nDKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; s=mandrill; d=spekit.co;\r\n h=From:Subject:To:Message-Id:Date:MIME-Version:Content-Type; i=support@spekit.co;\r\n bh=ZNxh0gTg5kVpAZyTHGJ2jWADa5UGAoPCP3GFX1DUu94=;\r\n b=RhjFdk+Tvr3HP43qJoKzVowGAs1SYJFfpq8MK4firz5tcpBYn3UEP/Z5cF+IBA74/PTmCahgTnXi\r\n   /EPSbY2b+20ERj4s4VUnwNZw8t4L98gSQiM6o3mF4iVI2JIgABU2Tn2nmB68kGZyxeSOs4bWtE+s\r\n   MXleLzg+uTftETJoUhM=\r\nReceived: from pmta03.mandrill.prod.suw01.rsglab.com (127.0.0.1) by mail180-17.suw31.mandrillapp.com id hb98u422sc0h for <automation+qa1@spekit.co>; Thu, 10 Sep 2020 16:18:42 +0000 (envelope-from 
<bounce-md_31064008.5f5a51e1.v1-8084cafe0c6c4aeca73fef8bdaf5b70b@mandrillapp.com>)\r\nDKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mandrillapp.com; \r\n i=@mandrillapp.com; q=dns/txt; s=mandrill; t=1599754721; h=From : \r\n Subject : To : Message-Id : Date : MIME-Version : Content-Type : From : \r\n Subject : Date : X-Mandrill-User : List-Unsubscribe; \r\n bh=ZNxh0gTg5kVpAZyTHGJ2jWADa5UGAoPCP3GFX1DUu94=; \r\n b=SusUoY2SOQosSQrzHafHGf7Pto1Ol3PDGU067dNsjT1ZIOuSP0Dz7DJwqgFn6NpwAV7X7e\r\n pzQQPyDJoAqQCjCdSqG9mp80hAEGwQC89GNu78a8o0NRC+BPRTGaNKV/jX06cXsgp+A4KXfY\r\n 13x1BInjKraTnCYz9TnzDUChIm3pg=\r\nFrom: Support <support@spekit.co>\r\nSubject: Your Spekit Login PIN\r\nReturn-Path: <bounce-md_31064008.5f5a51e1.v1-8084cafe0c6c4aeca73fef8bdaf5b70b@mandrillapp.com>\r\nReceived: from [3.128.246.0] by mandrillapp.com id 8084cafe0c6c4aeca73fef8bdaf5b70b; Thu, 10 Sep 2020 16:18:41 +0000\r\nTo: Automation <automation+qa1@spekit.co>\r\nX-Report-Abuse: Please forward a copy of this message, including all headers, to abuse@mandrill.com\r\nX-Report-Abuse: You can also report abuse here: http://mandrillapp.com/contact/abuse?id=31064008.8084cafe0c6c4aeca73fef8bdaf5b70b\r\nX-Mandrill-User: md_31064008\r\nMessage-Id: <31064008.20200910161841.5f5a51e1e2be13.10518479@mail180-17.suw31.mandrillapp.com>\r\nDate: Thu, 10 Sep 2020 16:18:41 +0000\r\nMIME-Version: 1.0\r\nContent-Type: multipart/alternative; boundary="_av-l5kOy35rlKJaV18wYlOHPA"\r\n\r\n--_av-l5kOy35rlKJaV18wYlOHPA\r\nContent-Type: text/plain; charset=utf-8\r\nContent-Transfer-Encoding: quoted-printable\r\n\r\n        Your Spekit Login PIN                                              \r\n                            Hi Automation, Someone (hopefully you) just\r\nlogged into your Spekit account with the email *automation+qa1@spekit.co*. 
\r\n \r\n If this was you, please use the code below to log-in, otherwise please\r\ncontact your admin and reset your password ASAP.\r\n   =3D *952681* =3D\r\n\r\n                                Enter PIN <https://app.spekit.co/verifypin>\r\n<http://www.twitter.com/spekitapp>\r\n<https://www.linkedin.com/company/spekit/> <https://medium.com/spekit>\r\n<https://spekit.co/>                                               \r\nQuestions? Contact us. <mailto:support@spekit.co>\r\n Copyright =C2=A9 2018 Spekit, Inc. All rights reserved.\r\n\r\n--_av-l5kOy35rlKJaV18wYlOHPA\r\nContent-Type: text/html; charset=utf-8\r\nContent-Transfer-Encoding: quoted-printable\r\n\r\n<!doctype html>\r\n<html xmlns=3D"http://www.w3.org/1999/xhtml" xmlns:v=3D"urn:schemas-microso=\r\nft-com:vml" xmlns:o=3D"urn:schemas-microsoft-com:office:office">\r\n    <head>\r\n        <!-- NAME: 1 COLUMN - FULL WIDTH -->\r\n        <!--[if gte mso 15]>\r\n        <xml>\r\n            <o:OfficeDocumentSettings>\r\n            <o:AllowPNG/>\r\n            <o:PixelsPerInch>96</o:PixelsPerInch>\r\n            </o:OfficeDocumentSettings>\r\n        </xml>\r\n        <![endif]-->\r\n        <meta charset=3D"UTF-8">\r\n        <meta http-equiv=3D"X-UA-Compatible" content=3D"IE=3Dedge">\r\n        <meta name=3D"viewport" content=3D"width=3Ddevice-width, initial-sc=\r\nale=3D1">\r\n        <title>Your Spekit Login PIN</title>\r\n        \r\n    <style type=3D"text/css">\r\n=09=09p{\r\n=09=09=09margin:10px 0;\r\n=09=09=09padding:0;\r\n=09=09}\r\n=09=09table{\r\n=09=09=09border-</tbody></table>                                        ')']

I need to get the value '952681', which appears twice in the message. Can someone help me with this?

Taimoor Pasha

1 Answer

If the format of the email stays the same, you can use a regex to pull the value out of the returned message text:

import re

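# Raw string, so the backslashes reach the regex engine unchanged.
# Lazily captures whatever sits between two asterisks.
pattern = r'\*([\s\S]*?)\*'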
res = re.findall(pattern, your_email_text)

The variable res then contains your number at the second position (index 1):

['automation+qa1@spekit.co', '952681']
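
A regex over `str(data)` depends on the escaped, quoted-printable dump staying exactly the same. As a possibly more robust variant, the raw bytes can be parsed with the standard-library `email` package first, and the regex applied only to the decoded text/plain part. The sketch below rests on some assumptions: the PIN is a run of digits wrapped in asterisks, as in the message shown in the question; `extract_pin`, `PIN_PATTERN`, and `sample` are hypothetical names made up for this illustration; and `sample` is a shortened stand-in, not the real email.

```python
import email
import re
from email import policy

# Assumption: the PIN is a run of digits wrapped in asterisks, e.g. *952681*.
PIN_PATTERN = re.compile(r"\*(\d+)\*")


def extract_pin(raw_message: bytes):
    """Return the first *digits* token from the text/plain part, or None."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    # Pick the text/plain alternative; None if the message has no such part.
    body = msg.get_body(preferencelist=("plain",))
    if body is None:
        return None
    # get_content() undoes quoted-printable and decodes the charset.
    text = body.get_content()
    match = PIN_PATTERN.search(text)
    return match.group(1) if match else None


# Shortened stand-in for the message in the question (not the real email).
sample = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/alternative; boundary="b1"\r\n'
    b"\r\n"
    b"--b1\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"logged in with the email *user@example.com*.\r\n"
    b"   =3D *952681* =3D\r\n"
    b"\r\n"
    b"--b1\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"<p>=3D <b>952681</b> =3D</p>\r\n"
    b"\r\n"
    b"--b1--\r\n"
)

print(extract_pin(sample))  # prints 952681
```

With the code from the question, the raw message bytes should be available as `data[0][1]` after `mail.fetch(...)`, so the call would look like `extract_pin(data[0][1])` instead of converting `data` with `str()`.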
flipSTAR