
I have written code to log in to my Gmail account over IMAP. How can I take the link (https://nationalskillsregistry.com) from a message in my inbox and get a response from it?

import imaplib
import getpass
import email
import datetime

detach_dir = '.' # directory where to save attachments (default: current)
user = "something@gmail.com"
pwd = "password"
subject_filter='(SUBJECT "Daily News ")'

# connecting to the gmail imap server
m = imaplib.IMAP4_SSL("imap.gmail.com")
m.login(user,pwd)
print "logged in successfully..."
m.select()
typ, data = m.search(None, subject_filter)
for num in data[0].split():
    rv, msg_data = m.fetch(num, '(RFC822)')
    if rv != 'OK':
        print "ERROR getting message", num
        continue
    # parse and print every matching message, not only the last one fetched
    msg = email.message_from_string(msg_data[0][1])
    print msg.get_payload(decode=True)
m.close()
m.logout()

This is the mail which I have:

Subject : Daily News - Announcing

Body :

Kindly note, if you are making online payment you do not need to visit any POS centre. Your account will be immediately renewed. If your account is not renewed immediately then wait for 24 hours and check if the validity has been extended. Kindly do not make multiple online payments. Visit us at https://nationalskillsregistry.com.

Tomerikoo
SRK

1 Answer


There are 2 steps you'll want to do -- extract the URL from the email, and then open it in a browser.

Step 1 will be the hard part. I'd recommend using a regular expression to parse the email text and pull out the URL. There are a lot of resources online that can help you with this. One of my favorites for testing regexes is RegExr.
The code should be pretty straightforward.

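import re
 ...
expr = r'((http)s?:\/\/((\.)?\w+)+(\/\S*)*)'
# The regex needs the message text as a string, not the Message object itself.
body = msg.get_payload(decode=True)
# Search with regex: grabs the first possible URL anywhere in the text. Case insensitive.
# (re.match would only find a URL at the very start of the text.)
matches = re.search(expr, body, re.I)
url = matches.group(0) if matches else None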
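One thing to watch for: `get_payload(decode=True)` returns `None` when the message is multipart, which is common for mail that carries both a plain-text and an HTML version. Below is a minimal sketch of step 1 that handles that case, assuming Python 3 (where `imaplib` hands back bytes). The helper names `text_body` and `first_url` and the sample message are made up for illustration; the regex is the one from above.

```python
import email
import re

# Same pattern as in the answer above.
expr = r'((http)s?:\/\/((\.)?\w+)+(\/\S*)*)'


def text_body(msg):
    """Return the first text/plain part of a message as str ('' if none)."""
    # walk() yields the message itself first, then every nested part.
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return ""


def first_url(raw_bytes):
    """Parse a raw RFC 822 message and return the first URL in its text, or None."""
    msg = email.message_from_bytes(raw_bytes)
    match = re.search(expr, text_body(msg), re.I)
    return match.group(0) if match else None


# Hypothetical stand-in for data[0][1] from m.fetch(num, '(RFC822)').
raw = (
    b"Subject: Daily News - Announcing\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"Kindly do not make multiple online payments. "
    b"Visit us at https://nationalskillsregistry.com.\r\n"
)
print(first_url(raw))  # https://nationalskillsregistry.com
```

In the fetch loop from the question this would be called as `first_url(data[0][1])` on the result of each `m.fetch(...)`.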

Step 2 is easy enough --

import webbrowser

...

webbrowser.open(url)

Or, if you want to download the raw HTML:

import urllib2

...

response = urllib2.urlopen(url)
html = response.read()
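Note that `urllib2` exists only in Python 2. On Python 3 the same call lives in `urllib.request`. A minimal sketch, assuming Python 3; the function name `fetch_html` and the 10-second default timeout are illustrative choices, not anything required by the library:

```python
from urllib.request import urlopen


def fetch_html(url, timeout=10):
    """Download url and return the response body decoded to str."""
    with urlopen(url, timeout=timeout) as response:
        # Use the charset the server declares, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

You would call it as `html = fetch_html(url)` with the URL extracted in step 1. Likewise, `urllib.urlretrieve` from Python 2 is `urllib.request.urlretrieve` on Python 3.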

If you need to download a file, you can use urllib to do the lifting.

import urllib

...

urllib.urlretrieve("http://www.example.com/songs/mp3.mp3", "mp3.mp3")

As for that regex, let's break it out a bit:

(  (http)s?:\/\/((\.)?\w+)+(\/\S*)*  )  

First off, note that it's all in parentheses. Parentheses mean that it's a capture group, so we'll be able to get to it later.

(http)s?  

This will look for the string 'http', which may or may not have an 's' following it.

:\/\/

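This will look for '://'. In Python the '/' has no special meaning in a regex, so the backslashes are optional here; they are only required in languages that delimit regex literals with '/' (JavaScript, for example).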

((\.)?\w+)+    #Grab everything between :// and /

This bit's fun. It'll look for a period (that is optional), followed by "word characters": letters, digits or underscores, not other punctuation or whitespace. Note that a hyphen is not a word character, so a host name containing one will be cut short.
It will repeat this 1 or more times. In doing so, it'll grab strings like
amazon.com
amazon.co.uk

(\/\S*)*

This will grab any number of strings that begin with a '/', each followed by any run of non-whitespace characters. This is things like
/
/home/
/foo.html?q=bar
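To see how the pieces behave together, here is a small sketch that runs the pattern over the sentence from the question's email and over two made-up strings that show its limits. It assumes Python 3; `find_url` and `clean_url` are hypothetical helper names, and the `example.com` inputs are invented for illustration.

```python
import re

expr = r'((http)s?:\/\/((\.)?\w+)+(\/\S*)*)'


def find_url(text):
    """Return the first URL-like match in text, or None."""
    m = re.search(expr, text, re.I)
    # Group 1 is the outer pair of parentheses, i.e. the whole URL.
    return m.group(1) if m else None


def clean_url(url):
    """Drop sentence punctuation that \\S* may have swallowed at the end."""
    return url.rstrip('.,;:!?')


body = "Kindly do not make multiple online payments. Visit us at https://nationalskillsregistry.com."
print(find_url(body))  # https://nationalskillsregistry.com

# The path part is greedy: it keeps a trailing period.
with_path = find_url("see https://example.com/foo.html?q=bar.")
print(with_path)             # https://example.com/foo.html?q=bar.
print(clean_url(with_path))  # https://example.com/foo.html?q=bar

# \w does not match '-', so the host is cut off at the hyphen.
print(find_url("see https://my-site.example.com/"))  # https://my
```

For mail from a single known sender this is usually good enough. If the links can contain hyphenated host names, the host part of the pattern would need to allow '-' as well.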

UrhoKarila