I'm trying to analyse my 25k+ emails, similar to the post here: http://beneathdata.com/how-to/email-behavior-analysis/
While the mentioned script used IMAP, I'm trying to implement this using the Gmail API for improved security. I'm using Python (and Pandas for data analysis) but the question applies more generally to use of the Gmail API.
From the docs, I'm able to read emails in using:
msgs = service.users().messages().list(userId='me', maxResults=500).execute()
and then access the data using a loop:
for msg in msgs['messages']:
    m_id = msg['id']  # get id of individual message
    message = service.users().messages().get(userId='me', id=m_id).execute()
    payload = message['payload']
    header = payload['headers']
    for item in header:
        if item['name'] == 'Date':
            date = item['value']
    # ** DATA STORAGE FUNCTIONS ETC **
but this is clearly very slow. In addition to making a separate get() call for every message, I have to call list() many times to page through all the emails.
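For reference, the paging I mean looks roughly like this. It is a minimal sketch, assuming each list() response carries a nextPageToken while more pages remain, as the docs describe; iter_message_ids and page_size are just names I picked:

```python
def iter_message_ids(service, user_id="me", page_size=500):
    """Yield every message id, following nextPageToken across list() pages."""
    page_token = None
    while True:
        response = service.users().messages().list(
            userId=user_id, maxResults=page_size, pageToken=page_token
        ).execute()
        # A page with no results may omit the 'messages' key entirely.
        for msg in response.get("messages", []):
            yield msg["id"]
        page_token = response.get("nextPageToken")
        if not page_token:
            break
```

With 25k+ emails and 500 ids per page, that is still 50+ list() calls before any message is fetched.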
Is there a higher-performance way to do this, e.g. asking the API to return only the data I need (here, the Date header) rather than all the unwanted message information?
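To make the idea concrete: from my reading of the docs, get() accepts format='metadata' together with metadataHeaders, and the Python client offers new_batch_http_request() to send several calls in one HTTP request. A rough, unmeasured sketch of what I have in mind follows; fetch_dates and batch_size are names I made up, and the batch size of 50 is my assumption about a safe value rather than something I have verified:

```python
def fetch_dates(service, message_ids, user_id="me", batch_size=50):
    """Return {message_id: Date header value} via metadata-only, batched get() calls."""
    dates = {}

    def on_response(request_id, response, exception):
        # Invoked once per message in a batch; failed messages are skipped.
        if exception is not None:
            return
        for header in response.get("payload", {}).get("headers", []):
            if header["name"] == "Date":
                dates[request_id] = header["value"]

    for start in range(0, len(message_ids), batch_size):
        batch = service.new_batch_http_request(callback=on_response)
        for m_id in message_ids[start:start + batch_size]:
            request = service.users().messages().get(
                userId=user_id,
                id=m_id,
                format="metadata",          # headers only, no body
                metadataHeaders=["Date"],   # restrict to the header I need
            )
            batch.add(request, request_id=m_id)
        batch.execute()
    return dates
```

Here message_ids would be the list of ids collected from list(). I don't know whether this is the recommended approach or how much it actually helps.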
Thanks.