I am not an experienced user of the Amazon SQS service. I have to read messages from a queue that I do not own and process them, building a small database from some of the information.

Up until now I just had some code that would read all the messages in the queue and process them. The script was running periodically.

However, I recently observed that the number of messages in the queue has suddenly become very large. When I took a sample of 10,000 messages, around 6,000 were duplicates.

I am puzzled by this sudden change in behavior (up until now I did not observe duplicate messages). The queue never seems to run out.

This is the code I use to read all the messages from the queue.

import boto.sqs
import boto.sqs.queue

conn = boto.sqs.connect_to_region(
    'myregions',
    aws_access_key_id='myacceskey',
    aws_secret_access_key='secretAccesKey')
q = boto.sqs.queue.Queue(connection=conn, url='outputQueue')

rs = q.get_messages(10)
all_messages = []
while len(rs) > 0:
    all_messages.extend(rs)
    print(len(all_messages))
    rs = q.get_messages(10)

Can anybody explain why I am suddenly getting duplicate messages? I do not have permission to see how large the queue is, so how can I get all the messages in it? Am I doing it right?

Usobi

1 Answer

After processing a message from the queue, you need to tell SQS that it has been processed by deleting it. If you do not, the message stays in the queue and is fetched again each time its visibility timeout expires, until it either reaches the maximum receive count and is moved to the dead-letter queue (if one is configured) or exceeds the retention period and expires.

SQS does not guarantee uniqueness, so you can get duplicates. You can set a visibility timeout to prevent a message from being read again for a period of time after it has been retrieved, e.g. a minute or so, which gives you time to process the message and delete it from the queue. This should avoid most duplicates.
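
To see why an undeleted message keeps coming back, here is a toy in-memory model of a visibility timeout. It is a sketch, not SQS or boto: `ToyQueue`, `receive`, `delete`, and `read_without_deleting` are hypothetical names invented for illustration, and time is just an integer counter.

```python
class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout (not SQS itself)."""

    def __init__(self, bodies, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        # message id -> [body, time at which the message becomes visible again]
        self.messages = {i: [body, 0] for i, body in enumerate(bodies)}

    def receive(self, now, max_messages=10):
        """Return up to max_messages visible (id, body) pairs and hide them."""
        batch = []
        for msg_id, entry in self.messages.items():
            if len(batch) == max_messages:
                break
            if entry[1] <= now:
                # Hide the message until the visibility timeout has passed.
                entry[1] = now + self.visibility_timeout
                batch.append((msg_id, entry[0]))
        return batch

    def delete(self, msg_id):
        """Remove a message for good, so it can never be delivered again."""
        self.messages.pop(msg_id, None)


def read_without_deleting(queue, polls):
    """Poll once per time unit and never delete, like the loop in the question."""
    seen = []
    for now in range(polls):
        seen.extend(msg_id for msg_id, _ in queue.receive(now))
    return seen


queue = ToyQueue(bodies=["a", "b", "c"], visibility_timeout=2)
seen = read_without_deleting(queue, polls=5)
print(len(seen), len(set(seen)))  # prints "9 3"
```

Each of the three messages is delivered three times because nothing is ever deleted; calling `delete` after each receive would leave nothing to redeliver.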

As for deleting the messages: iterate over them, process each one, and then run either...

conn.delete_message(q, message)

or

q.delete_message(message)
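
Putting the pieces together, a minimal sketch of a receive, process, delete loop might look like this. It assumes `queue` behaves like the boto 2 `Queue` from the question (`get_messages`, `delete_message`, and messages exposing `id` and `get_body()`); `drain_queue` and `handle` are hypothetical names, and the 60-second visibility timeout is only an example value.

```python
def drain_queue(queue, handle, batch_size=10, visibility_timeout=60):
    """Receive, process, and delete messages until a receive returns nothing.

    `queue` is assumed to offer boto 2 style `get_messages` and
    `delete_message`; `handle` is a hypothetical callback that is given
    each message body. Returns the number of distinct messages handled.
    """
    seen_ids = set()  # guards against the occasional duplicate delivery
    while True:
        batch = queue.get_messages(batch_size,
                                   visibility_timeout=visibility_timeout)
        if not batch:
            # Treated as "done" here, as in the question's loop.
            break
        for message in batch:
            if message.id not in seen_ids:
                handle(message.get_body())
                seen_ids.add(message.id)
            # Delete only after processing, so the message is not redelivered.
            queue.delete_message(message)
    return len(seen_ids)
```

With the `q` object from the question this could be called as `drain_queue(q, process)`, where `process` is whatever function stores the message in the database. Note that a single empty response does not prove the queue is empty, so a periodic script should simply try again on its next run.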

Simon McClive
  • Ok, I will. Any idea why the code has been working nicely up until now? – Usobi Feb 02 '16 at 22:55
  • Define working nicely. How long have you been reading from the queue for without dupes? – Simon McClive Feb 02 '16 at 23:02
  • Oh, now I understand. The messages have a visibility timeout, so if there are not many of them I can read them all very quickly and the loop does indeed finish. That's why it has been running OK up until now. Suddenly more messages appear and the loop is not fast enough; then the duplicates come and the process becomes a never-ending story! Could that reasoning be right? – Usobi Feb 02 '16 at 23:20
  • That sounds perfectly reasonable. – Simon McClive Feb 02 '16 at 23:22
  • How do you send a "notification" that the message has been processed? – Arash Outadi Jul 16 '21 at 21:39