
I want to build a basic email attachment processor on AWS. I followed this example (https://medium.com/caspertechteam/processing-email-attachments-with-aws-a35a1411a0c4) and got everything on AWS (SES, S3, Lambda) talking to each other.

I can read the content of a plain-text email. Now I'm moving on to messages with attachments, and I have this code (AWS Lambda, Python 3.8):

import boto3
import email 

def lambda_handler(event, context):
    s3 = boto3.resource('s3')
    
    bucket = 'myBucket'
    key = 'inbound/cl2kjuud8oaqm6ihac491hs5s404krv6nkpq7781' # with csv attachment
    # key = 'inbound/kvqlcdlaqaqhq8i5a3m51nsr8ehehhsl227hu881' # no att
    
    obj = s3.Bucket(bucket).Object(key)
    body = obj.get()["Body"].read().decode('utf-8') # decode necessary, or error message
    
    msg = email.message_from_string(body)
    
    attachment = get_attachment(msg, 'text/plain')
    
# this code is copy&paste from tutorial above
def get_attachment(msg, content_type):
    """
    Moves through a tree of email Messages to find an attachment.
    :param msg: An email Message object containing an attachment in its Message tree
    :param content_type: The type of attachment that is being searched for
    :return: An email Message object containing base64 encoded contents (i.e. the attachment)
    """
    attachment = None
    msg_content_type = msg.get_content_type()

    if ((msg_content_type == content_type or msg_content_type == 'text/plain')
            and is_base64(msg.get_payload())):
        attachment = msg

    elif msg_content_type.startswith('multipart/'):
        for part in msg.get_payload():
            attachment = get_attachment(part, content_type)
            attachment_content_type = attachment.get_content_type()

            if (attachment and (attachment_content_type == content_type
                                or attachment_content_type == 'text/plain')
                    and is_base64(attachment.get_payload())):
                break
            else:
                attachment = None

    return attachment

I get this error

Response:
{
  "errorMessage": "'NoneType' object has no attribute 'get_content_type'",
  "errorType": "AttributeError",
  "stackTrace": [
    "  File \"/var/task/lambda_function.py\", line 24, in lambda_handler\n    attachment = get_attachment(msg, 'text/plain')\n",
    "  File \"/var/task/lambda_function.py\", line 58, in get_attachment\n    attachment_content_type = attachment.get_content_type()\n"
  ]
}

I tried it with the Python 2.7 runtime and played with message_from_bytes(), but nothing seems to work. Now I'm stuck. Can you share some advice?

John Rotenstein
Alex

1 Answer


The following Python function worked for me. It extracts the attachments from an SES email that was saved in S3 and posts them to my backend API using the multipart/form-data encoding type.

import os
import boto3
import email
import requests
from email import policy
from email.parser import BytesParser

def lambda_handler(event, context):
    s3 = boto3.client('s3')

    recordId = event['Records'][0]['ses']['mail']['messageId']
    
    # Get email from the S3 bucket
    obj = s3.get_object(Bucket=os.getenv('SES_EMAIL_BUCKET'), Key=f'emails/{recordId}')
    raw_mail = obj['Body'].read()
    
    msg = BytesParser(policy=policy.default).parsebytes(raw_mail)
    
    for part in msg.iter_parts():
        content_disposition = part.get_content_disposition()
        if content_disposition and 'attachment' in content_disposition:
            file_name = part.get_filename()
            file_data = part.get_content()

            # Post attachments for processing
            url = os.getenv('MY_API_ENDPOINT')

            files = {
                'file': (file_name, file_data),
            }
            
            response = requests.post(url, files=files)

            # Add error handling based on the API's response
            print("API Response is")
            print(response)

Here, "SES_EMAIL_BUCKET" and "MY_API_ENDPOINT" are stored as environment variables. Note that the AWS Lambda Python runtime does not include the requests module, so you need to install it locally and include it in the deployment package (more information can be found here).
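To try the parsing step without SES or S3, here is a minimal local sketch. It builds a sample message in memory (the addresses, subject, and report.csv are made up for illustration) and collects the attachments with msg.walk(), which also visits nested multipart parts and only ever yields message objects, so there is no None to call get_content_type() on. It uses get_payload(decode=True) instead of get_content() so the data always comes back as bytes. The names extract_attachments and build_sample_mail are hypothetical helpers, not part of the function above.

```python
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser


def extract_attachments(raw_mail: bytes):
    """Return a list of (filename, bytes) pairs, one per attachment."""
    msg = BytesParser(policy=policy.default).parsebytes(raw_mail)
    attachments = []
    # walk() visits the message itself and every nested part.
    for part in msg.walk():
        if part.get_content_disposition() == 'attachment':
            # decode=True undoes base64 / quoted-printable and returns bytes.
            attachments.append((part.get_filename(), part.get_payload(decode=True)))
    return attachments


def build_sample_mail() -> bytes:
    """Build a made-up email with one CSV attachment (illustration only)."""
    msg = EmailMessage()
    msg['From'] = 'sender@example.com'
    msg['To'] = 'inbound@example.com'
    msg['Subject'] = 'Report'
    msg.set_content('See the attached file.')
    msg.add_attachment(
        b'id,name\n1,alice\n',
        maintype='text',
        subtype='csv',
        filename='report.csv',
    )
    return msg.as_bytes()


if __name__ == '__main__':
    for name, data in extract_attachments(build_sample_mail()):
        print(name, len(data))
```

In the Lambda handler, the same extract_attachments(raw_mail) call could take the bytes read from obj['Body'], assuming the object stored by SES is the raw MIME message.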

A.M.N.Bandara