
I have an AWS Lambda Function 'A' with an SQS DeadLetterQueue configured. When the Lambda fails to process an event, the event is correctly sent to the DLQ. Is there a way to re-process events that ended up in the DLQ?

I found two solutions, but they both have drawbacks:

  1. Create a new Lambda Function 'B' that reads from the SQS queue and then sends the events one by one to the original Lambda 'A'. -> Here I have to write new code and deploy a new Function
  2. Trigger Lambda 'A' again as soon as an event arrives in the SQS queue -> This looks dangerous, as I could end up with looping executions

My ideal solution would be to re-process the discarded events on demand with Lambda 'A', without creating a new Lambda 'B' from scratch. Is there a way to accomplish this?

Dos
  • It's not clear that you can automate this at all without potentially introducing an infinite loop. If a given message fails to be processed N times and ends up on the DLQ then there's presumably a number of reasons why that could have happened, including that there is something innate about the message that triggered an error in your Lambda function. You can't easily automate the recovery from that - it's going to require debugging and potential code re-deployment, or message editing and resubmission. – jarmod Jul 01 '20 at 17:08
  • Thanks for the suggestion! My use case is the following: I have a Lambda that calls an external API, and some invocations can fail for throttling reasons. I would like to find a way to reprocess these events later on demand (for example, 1 hour later) without writing a Lambda to do that. I have the events in the SQS queue, and I have the Lambda to process them: what I am missing is an on-demand trigger or a similar service – Dos Jul 02 '20 at 16:32
  • 1
    You might consider an independent 'retry' queue. If your API calls are throttled, send the underlying message to that queue rather than returning it to its original queue. Use CloudWatch to run a new, simple Lambda on an hourly, or other, schedule. Have it pull from the retry queue using some business logic and send messages back to the original queue, for subsequent processing. Or some variant of that pattern, perhaps. Or just use the DLQ as the 'retry' queue. – jarmod Jul 02 '20 at 16:46
  • Then it seems that I have to write some custom code in any case, at least a new Lambda Function that re-processes the DLQ events. Thank you for the suggestion @jarmod; a CloudWatch rule scheduled on a periodic basis seems to be the best option, as I would otherwise need custom code to launch the Lambda on demand. – Dos Jul 09 '20 at 08:01
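Since the comments above center on throttled API calls, it's worth noting that the call into the external API can also retry in place with exponential backoff and jitter, which often avoids sending the event to the DLQ at all. Below is a minimal sketch of that pattern; the helper name `with_backoff` and its parameters are illustrative, not part of any AWS SDK:

```python
import random
import time


def with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=60.0):
    """Retry fn() with exponential backoff and full jitter.

    Re-raises the last exception once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Full jitter: sleep a random amount up to the capped exponential delay
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(random.uniform(0, delay))
```

You would wrap the throttled API call, e.g. `with_backoff(lambda: call_external_api(event))` (where `call_external_api` stands in for your own client code). This only helps with transient throttling within one invocation; events that keep failing still belong in the DLQ.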

1 Answer


In the end, I didn't find any AWS-provided way to reprocess the DLQ events of a Lambda Function, so I created my own custom Lambda Function (I hope this will be helpful to other developers with the same issue):

import boto3

# Clients created outside the handler are reused across warm invocations
lamb = boto3.client('lambda')
sqs = boto3.resource('sqs')
queue = sqs.get_queue_by_name(QueueName='my_dlq_name')


def lambda_handler(event, context):
    # Process up to 100 batches of 10 messages per invocation
    for _ in range(100):
        messages_to_delete = []
        for message in queue.receive_messages(MaxNumberOfMessages=10):
            payload_bytes_array = bytes(message.body, encoding='utf8')
            # Re-send the original payload to the target Lambda
            lamb.invoke(
                FunctionName='my_lambda_name',
                InvocationType="Event",  # Event = invoke the function asynchronously
                Payload=payload_bytes_array
            )

            # Mark the message for deletion from the DLQ
            messages_to_delete.append({
                'Id': message.message_id,
                'ReceiptHandle': message.receipt_handle
            })

        # If no messages were received, the queue is empty: stop
        if len(messages_to_delete) == 0:
            break
        # Otherwise delete the re-sent messages so they are not processed twice
        else:
            deleted = queue.delete_messages(Entries=messages_to_delete)
            print(deleted)

Part of the code is inspired by this post
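One caveat worth noting: SQS `delete_messages` (the `DeleteMessageBatch` API) accepts at most 10 entries per call. The code above stays within that limit because each `receive_messages` call returns at most 10 messages, but if you ever accumulate entries across several receives before deleting, a small chunking helper keeps each batch legal. This is a sketch of my own (the helper name `chunk_entries` is not part of boto3):

```python
def chunk_entries(entries, batch_size=10):
    """Split a list of SQS delete entries into batches of at most batch_size,
    since DeleteMessageBatch rejects calls with more than 10 entries."""
    return [entries[i:i + batch_size] for i in range(0, len(entries), batch_size)]
```

Usage would be `for batch in chunk_entries(all_entries): queue.delete_messages(Entries=batch)`.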

Dos
  • Here is a complete approach, considering message replay with delay and a second DLQ to handle max number of retries: https://aws.amazon.com/blogs/compute/using-amazon-sqs-dead-letter-queues-to-replay-messages/ – juliano.net Aug 24 '21 at 14:12