I am trying to read all the mails of a user with the Gmail API, filtering by inbox. Reading 16k+ mails takes around 2 hours. Is there a more efficient way?

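import math
import socket
from datetime import datetime

from dateutil import parser  # assumed: `parser` is dateutil.parser

# `service` is an authorized Gmail API client and `sender_name` is a helper
# function; both are defined elsewhere.
now = datetime.now()
timestamp = math.floor(datetime.timestamp(now))

count = 0
while True:
    results = service.users().messages().list(maxResults=50, userId='me', q='in:inbox before:{}'.format(timestamp)).execute()
    messages = results.get('messages')
    EmailRecepit = []
    if messages is None:
        break
    for msg in messages:
        print("Count", count)
        count += 1
        # Get the message from its id (one HTTP request per message)
        txt = service.users().messages().get(userId='me', id=msg['id']).execute()
        try:
            # Get value of 'payload' from dictionary 'txt'
            payload = txt['payload']
            headers = payload['headers']
            # Messages without parts have no 'parts' key
            attachment = payload.get('parts', [])
            for header in headers:
                if header['name'] == 'From':  # getting the sender
                    msg_from = header['value']
                    name = sender_name(msg_from)  # sender name, not email
                elif header['name'] == 'Date':
                    # 'headers' is a list of dicts, so look the date up by name
                    msg_date = header['value']
            for a in attachment:
                if a.get('filename'):  # skip parts with an empty filename
                    document = a.get('filename')
            if count % 50 == 0:
                # Move the cutoff back to the date of the last message on this page
                timestamp = math.floor(datetime.timestamp(parser.parse(msg_date)))
        except socket.error as error:
            pass
        except Exception:
            pass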

1 Answer

The fewer requests you perform, the more efficient your code will be.

Therefore, you should modify the request

service.users().messages().list(maxResults=50,userId='me',q='in:inbox before:{}'.format(timestamp)).execute()

by increasing the maximum number of results per request, e.g. maxResults=500, which is the largest value messages.list accepts.
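As a minimal sketch of what this could look like, the function below lists message ids with 500 results per page and follows nextPageToken instead of re-querying by timestamp. The function name list_inbox_message_ids and its parameters are hypothetical; service is assumed to be the same authorized Gmail client as in the question.

```python
def list_inbox_message_ids(service, query="in:inbox", page_size=500):
    """Collect the ids of all messages matching `query`.

    `service` is assumed to be an authorized Gmail API client, as in the
    question. The function name and parameters are illustrative only.
    """
    ids = []
    page_token = None
    while True:
        response = service.users().messages().list(
            userId="me", q=query, maxResults=page_size, pageToken=page_token
        ).execute()
        # A page without matches has no 'messages' key at all
        ids.extend(m["id"] for m in response.get("messages", []))
        # The last page carries no nextPageToken
        page_token = response.get("nextPageToken")
        if not page_token:
            return ids
```

With 16k+ mails this needs roughly 33 list requests instead of more than 300 at maxResults=50. The per-message get() calls remain the dominant cost.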

However, be aware that calling service.users().messages().get() on a large number of emails means a large number of requests, which will inevitably make your code slow.

Consider narrowing down the number of results returned by service.users().messages().list by expanding the q query so that you only retrieve the emails you are really interested in. For example: only emails with attachments, only emails from a certain sender, or only emails with a certain subject line.
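To illustrate, a narrower q string could be assembled from standard Gmail search operators (has:attachment, from:, subject:, before:). This is only a sketch; the helper build_inbox_query and its arguments are made up for the example.

```python
def build_inbox_query(sender=None, subject=None, attachments_only=False, before=None):
    """Build a Gmail search string restricted to the inbox.

    Hypothetical helper: it only joins Gmail search operators with spaces.
    """
    terms = ["in:inbox"]
    if attachments_only:
        terms.append("has:attachment")
    if sender:
        terms.append("from:{}".format(sender))
    if subject:
        # Parentheses group several words under one operator
        terms.append("subject:({})".format(subject))
    if before is not None:
        terms.append("before:{}".format(before))
    return " ".join(terms)
```

The result is passed as the q argument of service.users().messages().list, e.g. q=build_inbox_query(attachments_only=True).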

If you must retrieve all 16k+ emails, the only way to speed up your code is to use batch requests.

Have a look e.g. here for a sample of how to implement a batch request for the Gmail API in Python. Note that using batch requests can still result in exceeding your quota if you perform too many requests.
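A rough sketch of the batching idea with google-api-python-client, not a tuned implementation: new_batch_http_request, batch.add and batch.execute are the client library's batch calls, while the function name fetch_messages_in_batches and the batch size of 50 are assumptions for this example. Each sub-request still counts against the quota, so rate limiting may require smaller batches or pauses between them.

```python
def fetch_messages_in_batches(service, message_ids, batch_size=50):
    """Fetch messages by id, grouping the get() calls into batch requests.

    `service` is assumed to be an authorized Gmail API client. Returns two
    dicts keyed by message id: the fetched messages and the per-message errors.
    """
    fetched = {}
    failed = {}

    def on_response(request_id, response, exception):
        # Called once per sub-request; request_id is the id passed to add()
        if exception is not None:
            failed[request_id] = exception
        else:
            fetched[request_id] = response

    for start in range(0, len(message_ids), batch_size):
        batch = service.new_batch_http_request(callback=on_response)
        for message_id in message_ids[start:start + batch_size]:
            # The request is only queued here, not executed
            batch.add(
                service.users().messages().get(userId="me", id=message_id),
                request_id=message_id,
            )
        # One HTTP round trip for the whole group
        batch.execute()
    return fetched, failed
```

The ids in failed can be retried later, for example after a short delay if the errors are rate-limit responses.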

ziganotschka