Python Email: When trying to get the HREF link value, the value does not save with equal signs (=)

Question

Specifically, I am looking for an email with the subject 'Your booking has been confirmed!' and I am trying to open the link in the body of that email. My code usually runs almost immediately after the confirmation email is sent; I plan to optimize it later so that it opens the first email with this subject line.

Code is shown below. The hyperlink I am trying to get contains '=', but when I print or return it, the equal signs (=) are missing.

For example, say the anchor tag has this href attribute:

https://stackoverflow.com/php.?i=857398425237459

The "value" in my code comes back as "https://stackoverflow.com/php.?i857398425237459", which makes it impossible to use the link later.

import imaplib
import email
import quopri
import HTMLParser
import time
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from bs4 import BeautifulSoup


class parseLinks(HTMLParser.HTMLParser):
    # Collect the href of every <a> tag into the module-level linkList.
    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href':
                    #print name
                    print type(value)
                    linkList.append(value)

def gmailLogin(username, password):

    M = imaplib.IMAP4_SSL('imap.gmail.com')

    M.login(username, password)
    M.select('Inbox')

    rv, data = M.search(None, 'ALL')
    mail_ids = data[0]
    id_list = mail_ids.split()
    latest_email_id = int(id_list[-1])
    typ, msg_data = M.fetch(latest_email_id, '(RFC822)')

    msg = email.message_from_string(msg_data[0][1])
    msg = str(msg.get_payload()[1])
    msg = quopri.decodestring(msg)

    linkParser = parseLinks()
    linkParser.feed(msg)
    M.close()
    M.logout()
    print linkList[0]
    return str(linkList[0])

linkList = []
browser = webdriver.Chrome()
answer = gmailLogin('USERNAME','PASSWORD')
browser.get(answer)

The character `=` is used by the quoted-printable encoding, so `quopri.decodestring` is probably what's eating it. Do you need to decode any other quoted-printable text in the message? Is the email actually in quoted-printable in the first place? — Daniel Pryden, Oct 05 '18 at 18:19
Yeah that was the mistake! The email is not in quoted-printable so I just removed that action. Thank you for your help! — Phython, Oct 05 '18 at 18:28
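
To illustrate the comment above, here is a minimal sketch of what `quopri.decodestring` does to text that was never quoted-printable encoded. It is written for Python 3 (the question's code is Python 2), and the URL is a made-up example, not the one from the real email.

```python
import quopri

# Hypothetical URL, sent as plain text (NOT quoted-printable encoded).
plain_url = b"https://example.com/page?id=4142"

# In quoted-printable, "=" followed by two hex digits is an escape for one
# byte, so "=41" is read as the byte 0x41 ("A") and the "=" disappears.
mangled = quopri.decodestring(plain_url)

# A sender that really uses quoted-printable writes a literal "=" as "=3D",
# and decoding that form gives the original URL back.
encoded_url = b"https://example.com/page?id=3D4142"
restored = quopri.decodestring(encoded_url)

print(mangled)
print(restored)
```

Decoding is only safe when the message part actually declares `Content-Transfer-Encoding: quoted-printable`.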

score 0 · Answer 1 · answered Oct 05 '18 at 18:18

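I figured it out: the problem was quopri. The email body was not quoted-printable encoded, so quopri.decodestring treated each '=' in the link as the start of an escape sequence and mangled it. Removing that call fixed it. See also: How to understand the equal sign '=' symbol in IMAP email text?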
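
One way to avoid guessing the encoding is to let the `email` package decode each part according to its own `Content-Transfer-Encoding` header. The following is a sketch for Python 3, not the code I actually ran; `LinkCollector`, `extract_links`, and the sample message are hypothetical names and data made up for illustration.

```python
import email
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(raw_message):
    """Return all hrefs found in the text/html parts of a raw RFC 822 message."""
    msg = email.message_from_bytes(raw_message)
    collector = LinkCollector()
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            # decode=True undoes quoted-printable or base64 only when the
            # part's Content-Transfer-Encoding header says it is needed.
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            collector.feed(payload.decode(charset, errors="replace"))
    return collector.links


# Made-up message whose body is plain 7bit text, so "=" must be left alone.
raw = (
    b"Subject: Your booking has been confirmed!\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: 7bit\r\n"
    b"\r\n"
    b'<html><body><a href="https://example.com/confirm?i=857398425237459">'
    b"Confirm</a></body></html>\r\n"
)

links = extract_links(raw)
print(links[0])
```

With the bytes returned by `M.fetch(..., '(RFC822)')` passed in as `raw_message`, the same function should also handle messages that really are quoted-printable, since the decoding decision is left to the headers.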

Phython
