
There are limits on the size of published SQS messages. A single message cannot be larger than 256 KB, and a batch of messages (max of 10) also cannot exceed 256 KB in total. When adding messages to a collection to later publish to an SQS queue, how can I keep track of the message sizes to ensure my request stays under the limit?

I have looked at methods to get the size of an object, and I know that the IDs of any failed messages will be available to me after calling sendMessageBatch(). Neither feels like a good solution: the size of the object itself omits the message's overhead data (which I assume also counts toward the limit), and I would really rather not have to manage failed messages simply because the batch was too large. I don't expect my messages to ever be that large, but you never know.

Code snippet to batch and send:

List<SendMessageBatchRequestEntry> entries = new LinkedList<>();

And then in a loop:

SendMessageBatchRequestEntry entry = new SendMessageBatchRequestEntry();

entry.setMessageBody(gson.toJson(message));
entry.setMessageAttributes(attributes);             
entry.setId(messageId);

// How to make sure `entry` is not larger than 256 KB, and how to
// make sure adding this entry won't cause the batch to exceed 256 KB?

entries.add(entry);

And lastly:

client.sendMessageBatch(queueUrl, entries);
asked by jzonthemtn
  • I don't think there's a clearly documented answer for a proper definition of "message size." There are message attribute keys and values that contribute to the message size, and presumably *some* overhead, not to mention the *typed* (?!) message attributes that [may or may not](http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/SQSMessageAttributes.html#sqs-attrib-md5) burn 12 extra bytes each, to store the length of their name, the length of their data type name, and the length of their value (4+4+4). – Michael - sqlbot Nov 08 '16 at 22:51
  • Time permitting, and assuming you don't get a straightforward answer, I'd like to write some test cases that push the edge of the message size limit (including single and multiple metadata keys, types, and values) to try to figure it out. Time permitting. – Michael - sqlbot Nov 09 '16 at 22:56
  • Is it still true that there is a max batch size of 10? I'm not seeing that in current aws docs https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sqs.html#SQS.Client.send_message_batch – Michael Smith May 25 '22 at 19:07
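Building on the attribute-size discussion in the comments, here is a rough, SDK-free sketch of how the size might be estimated. It assumes (per the developer guide linked above) that the limit counts the UTF-8 bytes of the body plus, for each attribute, its name, its data type label, and its value; the class and method names are illustrative, not from any AWS library, and any per-attribute fixed overhead is ignored:

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

public class SqsSizeEstimator {
    // 256 KB limit for a single message and for a whole batch.
    static final int MAX_BYTES = 262_144;

    // Rough size of one message: UTF-8 bytes of the body plus, for each
    // attribute, its name, data type label (e.g. "String"), and value.
    // `attributes` maps an attribute name to a two-element array:
    // { dataType, value }.
    static int estimateSize(String body, Map<String, String[]> attributes) {
        int size = utf8Len(body);
        for (Map.Entry<String, String[]> a : attributes.entrySet()) {
            size += utf8Len(a.getKey());      // attribute name
            size += utf8Len(a.getValue()[0]); // data type label
            size += utf8Len(a.getValue()[1]); // attribute value
        }
        return size;
    }

    // Byte length, not char length: multi-byte characters count extra.
    static int utf8Len(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

Comparing `estimateSize(...)` against `MAX_BYTES` before adding an entry to a batch gives a conservative guard, modulo whatever undocumented overhead SQS adds on top.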

2 Answers


According to the AWS SQS SendMessageBatch documentation,

The total size of all messages that you send in a single SendMessageBatch call can't exceed 262,144 bytes (256 KB).

So a single SendMessageBatch call must not exceed 256 KB. Looking at your code snippet, you have used a List named entries. LinkedList is serializable in Java, and so are the AWS SDK model classes it holds. See this answer.

You can get a good estimation of the size of the object by serializing it. The code for it is given below.

import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;

// Serialize the list and measure the resulting byte count.
ByteArrayOutputStream baos = new ByteArrayOutputStream();
ObjectOutputStream oos = new ObjectOutputStream(baos);
oos.writeObject(entries);
oos.close();
System.out.println(baos.size());

With this you can get an approximate size, in bytes, of the entries LinkedList. After adding a new entry, check whether the entries list exceeds 256 KB. If it does, remove that entry, call sendMessageBatch with the previous entries, and start a new batch with the removed entry. The code snippet is given below.

List<SendMessageBatchRequestEntry> entries = new LinkedList<>();

// In the loop that builds messages:
SendMessageBatchRequestEntry entry = new SendMessageBatchRequestEntry();
entry.setMessageBody(gson.toJson(message));
entry.setMessageAttributes(attributes);
entry.setId(messageId);

entries.add(entry);
if (getObjectSize(entries) > 262144) {
    // This entry pushed the batch over the limit: send the previous
    // entries and start a new batch with this entry (don't drop it).
    entries.remove(entry);
    client.sendMessageBatch(queueUrl, entries);
    entries = new LinkedList<>();
    entries.add(entry);
}

private static int getObjectSize(List<?> list) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ObjectOutputStream oos = new ObjectOutputStream(baos);
    oos.writeObject(list);
    oos.close();
    return baos.size();
}
answered by Keet Sugathadasa
  • Repeatedly serializing the batch to check size isn't just a non-starter, it is slower than sending the individual messages. If you can't control message size then don't batch them. – John Bryant Jul 16 '20 at 02:08
  • @JohnBryant I don't understand the reason for your downvote. The question states `how can I keep track of the message sizes to ensure my request stays under the limit?`, and the answer is for that. The reason for batching is to avoid rate limits. Most message sizes cannot be controlled, and if you could, you only need a simple for loop. Could you please read the question again and see whether my answer is helpful? I think your comment should go on the question. – Keet Sugathadasa Jul 17 '20 at 12:01
  • Sorry for being so grumpy, but repeatedly serializing a list of objects to check the length causes more problems, especially in a high-volume message application. See a quick attempt below; it's not compiling code, so it may need some work. – John Bryant Jul 17 '20 at 20:37
  • is there any limit on number of messages in one batch for pushing in batch to AWS SQS ? – Vishal Jamdade Nov 27 '21 at 09:16
  • @VishalJamdade each message can be at most 256KB, but also with batching the sum of the sizes of all messages in the batch should not exceed 256KB; so you can't just say it's X number of messages. It's probably just better to use SQS' extended library then you can send large message sizes and not worry about checking the size. – Roe Aug 11 '22 at 21:51

Here is a simpler approach that avoids repeatedly wasting time serializing the entire list of messages. Instead, get a ballpark value from the body and account for the attributes and IDs. I am allowing about 6 KB of overhead in maxBatchMessageSize; if I could calculate the size of the attributes, I could get the number closer.

List<SendMessageBatchRequestEntry> entries = new LinkedList<>();
int sqsBatchSize = 10;
int maxBatchMessageSize = 250000; // ~6 KB headroom under the 262,144-byte limit
int currentBatchSize = 0;

private void sendMessage(Object message) {
    String msgString = gson.toJson(message);
    // Count bytes, not chars: multi-byte UTF-8 characters take more space.
    int size = msgString.getBytes(StandardCharsets.UTF_8).length
            + messageId.length()
            + "whatever else is a factor".length();
    if (currentBatchSize + size > maxBatchMessageSize || entries.size() == sqsBatchSize) {
        client.sendMessageBatch(queueUrl, entries);
        entries.clear();
        currentBatchSize = 0;
    }
    SendMessageBatchRequestEntry entry = new SendMessageBatchRequestEntry();
    entry.setMessageBody(msgString);
    entry.setMessageAttributes(attributes);
    entry.setId(messageId);
    entries.add(entry);
    currentBatchSize += size;
}

public void close() {
    if(!entries.isEmpty()) {
        client.sendMessageBatch(queueUrl, entries);
    }
}

*The code above is pseudo-code, but the concept is valid.
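The accumulator idea above can be isolated from the SDK types entirely. A sketch (hypothetical class and method names, plain strings standing in for entries) that chunks message bodies into batches respecting both the 10-entry and 256 KB limits:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class SqsBatcher {
    static final int MAX_BATCH_ENTRIES = 10;
    static final int MAX_BATCH_BYTES = 262_144;

    // Split message bodies into batches that respect both SQS limits:
    // at most 10 entries per batch, and at most 256 KB of body bytes.
    static List<List<String>> toBatches(List<String> bodies) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int currentBytes = 0;
        for (String body : bodies) {
            int size = body.getBytes(StandardCharsets.UTF_8).length;
            // Flush the current batch if this body won't fit.
            if (!current.isEmpty()
                    && (current.size() == MAX_BATCH_ENTRIES
                        || currentBytes + size > MAX_BATCH_BYTES)) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(body);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

Note this only counts body bytes, not attributes, and a single body larger than 256 KB still ends up alone in an (oversized) batch; that case needs the SQS extended client library mentioned in the comments above.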

answered by John Bryant
  • Be careful with the code here as it will require a synchronize statement should it be called by multiple threads. – John Bryant Jul 17 '20 at 20:48