
I am trying to read a specific email from my mailbox and follow its 'Click here' hyperlink so that the Excel file it points to is downloaded to my laptop. This is the code I have so far:

import email
import imaplib
import traceback

ORG_EMAIL   = "@gmail.com"
FROM_EMAIL  = "myemail" + ORG_EMAIL
FROM_PWD    = "password"
IMAP_SERVER = "imap.gmail.com"
IMAP_PORT   = 993

def read_email_from_gmail():
    try:
        mail = imaplib.IMAP4_SSL(IMAP_SERVER, IMAP_PORT)
        mail.login(FROM_EMAIL, FROM_PWD)
        mail.select('inbox')

        # search() returns (status, [b'1 2 3 ...']); split out the message ids
        status, data = mail.search(None, 'ALL')
        id_list = data[0].split()

        # walk the mailbox from the newest message to the oldest
        for msg_id in reversed(id_list):
            status, data = mail.fetch(msg_id, '(RFC822)')
            for response_part in data:
                # the message itself arrives as a (header, raw_bytes) tuple
                if isinstance(response_part, tuple):
                    msg = email.message_from_bytes(response_part[1])
                    email_subject = msg['subject']
                    email_from = msg['from']
                    # print('From : ' + email_from + '\n')
                    # print('Subject : ' + email_subject + '\n')
    except Exception as e:
        traceback.print_exc()
        print(str(e))

read_email_from_gmail()

Can someone please help me follow the 'Click here to download data' link in the email I am fetching?

Wiktor Stribiżew
RCN
  • I think you can use a regex or a parser library to get the link from the extracted message and then use a library to download the file from it. – Metin Usta Dec 21 '21 at 18:06
  • Thank you for your response! Sorry, I am new to this. Do you have any reference that I can go through? – RCN Dec 21 '21 at 18:10
  • Extracting links from a text: https://stackoverflow.com/a/840110/11560290 Downloading an Excel file: https://stackoverflow.com/questions/25415405/downloading-an-excel-file-from-the-web-in-python – Metin Usta Dec 21 '21 at 18:43
  • What do you mean by 'clicking' on a link? You can extract the link from the message, but what do you want to do with it? – RJ Adriaansen Dec 21 '21 at 19:01
  • I mean that I want to follow the link to download data. – RCN Dec 21 '21 at 19:07
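
The suggestion in the comments (pull the link out of the message, then download it) can be sketched with the standard library alone. This is only a rough illustration, not a tested solution for your mailbox: LinkCollector and find_link are hypothetical names, and raw_bytes is assumed to be the raw message bytes that mail.fetch(..., '(RFC822)') returns. It looks the link up by its visible text instead of by position.

```python
import email
from email import policy
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for every <a href=...> in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # only keep text that appears inside an open <a href=...> tag
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def find_link(raw_bytes, link_text):
    """Return the href of the first link whose text contains link_text, else None."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    body = msg.get_body(preferencelist=("html",))
    if body is None:
        return None  # the message has no HTML part
    collector = LinkCollector()
    collector.feed(body.get_content())
    for text, href in collector.links:
        if link_text.lower() in text.lower():
            return href
    return None
```

Something like find_link(raw_bytes, 'Click here') would then give the URL to pass to whichever download library you choose.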

2 Answers


You can use imap-tools: https://pypi.org/project/imap-tools/#id7

It is easy to use and keeps the code short:

pip install imap-tools beautifulsoup4 requests

import re

import requests
from bs4 import BeautifulSoup
from imap_tools import MailBox, AND

# get the HTML bodies of the matching emails from the INBOX folder, newest first
with MailBox('imap.gmail.com').login('email', 'pwd', 'INBOX') as mailbox:
    bodies = [msg.html for msg in mailbox.fetch(AND(subject='your email subject'), reverse=True)]

# parse all bodies together and collect every https link
soup = BeautifulSoup("\n".join(bodies), 'html.parser')
links = []
for link in soup.find_all('a', attrs={'href': re.compile("^https://")}):
    links.append(link.get('href'))
print(links)  # inspect this list to find the index of the download link (counting from 0)

result = links[5]  # mine was at index 5
print(result)

# download the file behind the link and save it to disk
resp = requests.get(result)
with open('test.csv', 'wb') as output:
    output.write(resp.content)
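
The snippet above always saves to test.csv, whatever the server actually sends. As a possible refinement, the file name can be taken from the response's Content-Disposition header when there is one. This is only a sketch: filename_from_headers and the default name download.xlsx are made up for illustration, and it assumes the server sets that header at all.

```python
import os
from email.message import Message


def filename_from_headers(headers, default="download.xlsx"):
    """Pick a file name from a Content-Disposition header, falling back to default."""
    disposition = headers.get("Content-Disposition")
    if not disposition:
        return default
    # let the stdlib email machinery parse the header parameters
    msg = Message()
    msg["Content-Disposition"] = disposition
    name = msg.get_filename()
    # keep only the last path component so the name cannot point outside the folder
    return os.path.basename(name) if name else default
```

With the resp object from above it could be called as filename_from_headers(resp.headers), and the result passed to open() in place of the hard-coded name.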
Ben Souchet
RCN

I highly recommend getting the HTML of the email and passing it to BeautifulSoup, which lets you search for elements using selectors. Once you have found the link you are looking for, you can send a request to it to get the Excel file.

I'm using the googleapiclient library, but hopefully my code snippet will give you an idea.

import base64
import html

import requests
from bs4 import BeautifulSoup

# get_message_using_id is not shown here; it is assumed to return the
# Gmail API message resource for message_id
string_returned = get_message_using_id(message_id)['payload']['body']['data']
encoded_bytes = bytes(string_returned, encoding='utf-8')
string_configured = base64.urlsafe_b64decode(encoded_bytes).decode('utf-8')
email_contents = html.unescape(string_configured)

soup = BeautifulSoup(email_contents, 'html.parser')
links = soup.select('a[href]')
# The links list contains all of the <a> tags in the email
link = links[0]['href']
# This will request the url of the link, then you can do anything with it.
response = requests.get(link)
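
One caveat: the snippet above reads payload['body']['data'], which is only filled in for single-part messages. For multipart emails the HTML usually sits in a nested entry under payload['parts']. Here is a minimal sketch of walking that structure, assuming the message resource follows the Gmail API's nested payload/parts layout; find_html_body is a hypothetical helper name.

```python
import base64


def find_html_body(payload):
    """Return the decoded text/html body from a Gmail API message payload, or None."""
    if payload.get("mimeType") == "text/html":
        data = payload.get("body", {}).get("data")
        if data:
            # the data is URL-safe base64; restore any padding that was stripped
            padded = data + "=" * (-len(data) % 4)
            return base64.urlsafe_b64decode(padded).decode("utf-8")
    # multipart messages nest their bodies under "parts", possibly several levels deep
    for part in payload.get("parts", []):
        found = find_html_body(part)
        if found is not None:
            return found
    return None
```

The returned string could then be handed to BeautifulSoup in the same way as email_contents.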
Ryan Nygard