
I have an ArrayList containing 80 to 100 records. I am trying to stream and send each individual record (a POJO, not the entire list) to a Kafka topic (Azure Event Hub). A cron job is scheduled to run every hour and send these records to the Event Hub.

I am able to see messages being sent to the Event Hub, but after 3 to 4 successful runs I get the following exception (within a single run, several messages are sent successfully and several fail with the exception below):

    Expiring 14 record(s) for eventhubname: 30125  ms has passed since batch creation plus linger time

Following is the producer configuration used:

    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    props.put(ProducerConfig.ACKS_CONFIG, "1");
    props.put(ProducerConfig.RETRIES_CONFIG, "3");

The message retention period is 7 days and the topic has 6 partitions. I am using Spring Kafka (2.2.3) to send the events; the method where the Kafka send happens is marked @Async:

    @Async
    protected void send(ProducerRecord<String, String> record) {
        kafkaTemplate.send(record);
    }

Expected: no exception to be thrown from Kafka. Actual: org.apache.kafka.common.errors.TimeoutException is thrown.

Prakash_se7en
  • The error is saying you've not yet filled the batch size of the producer (the records aren't sent immediately). You could either reduce the batch size in the producer configs or periodically flush the producer on your own – OneCricketeer Sep 19 '19 at 13:12
  • Many thanks for the reply @cricket_007. What size would you recommend, given that the default is 16384? – Prakash_se7en Sep 19 '19 at 13:50
  • Are your 80-100 records in total larger than 1.6 MB? – OneCricketeer Sep 19 '19 at 14:45
  • it will be close to 150-200 kb @cricket_007 – Prakash_se7en Sep 19 '19 at 14:53
  • Oops, I meant 1.6 Kb above. Okay, so on the low end, `150000/16384` is about 9 total batches, by default, with some remainder. You'll need to adjust the value such that you won't have data remaining in an unsent batch – OneCricketeer Sep 19 '19 at 15:21
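
For context, the "periodically flush" suggestion from the comments could look like the sketch below, using Spring Kafka's `KafkaTemplate` from the question (`records` and `"my-topic"` are hypothetical names):

    // Queue all records, then flush so partially filled producer batches
    // are sent immediately instead of waiting for batch.size to fill up
    for (String record : records) {          // hypothetical list of serialized POJOs
        kafkaTemplate.send("my-topic", record);
    }
    kafkaTemplate.flush();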

2 Answers


Prakash - we have seen a number of issues where spiky producer patterns hit batch timeouts.

The problem here is that the producer has two TCP connections that can go idle for > 4 mins - at that point, Azure load balancers close out the idle connections. The Kafka client is unaware that the connections have been closed so it attempts to send a batch on a dead connection, which times out, at which point retry kicks in.

  • Set connections.max.idle.ms to < 4 minutes – this allows the Kafka client's network layer to gracefully handle connection close for the producer's message-sending TCP connection
  • Set metadata.max.age.ms to < 4 minutes – this is effectively a keep-alive for the producer's metadata TCP connection (a config sketch follows below)
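
In code, building on the `props` map from the question (a minimal sketch; the 3-minute values are illustrative, anything under 4 minutes works):

    // Close idle connections client-side before the Azure load balancer does (~4 min)
    props.put(ProducerConfig.CONNECTIONS_MAX_IDLE_MS_CONFIG, "180000"); // 3 minutes
    // Refresh metadata periodically; acts as a keep-alive for the metadata TCP connection
    props.put(ProducerConfig.METADATA_MAX_AGE_CONFIG, "180000");        // 3 minutes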

Feel free to reach out to the EH product team on GitHub; we are fairly good about responding to issues - https://github.com/Azure/azure-event-hubs-for-kafka

  • Additional configurations can be found here - https://github.com/Azure/azure-event-hubs-for-kafka/blob/master/CONFIGURATION.md – Arthur Erlendsson Dec 06 '19 at 21:34
  • Thanks for the reply. One additional question: will the Kafka producer retry (if retries in the producer config is set > 3) in case of "xxx ms has passed since batch creation plus linger time"? – Prakash_se7en Apr 05 '22 at 15:46

This exception indicates you are queueing records faster than they can be sent. Once a record is added to a batch, there is a time limit for sending that batch, controlled by the producer configuration parameter request.timeout.ms. If the batch has been queued longer than that limit, this exception is thrown and the records in that batch are removed from the send queue.
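
As a hedged sketch of the knobs involved, reusing the `props` map from the question (the values are illustrative starting points, not recommendations):

    // Give queued batches more time before they expire
    props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, "60000");
    // Smaller batches fill up and get sent sooner for a low-volume producer
    props.put(ProducerConfig.BATCH_SIZE_CONFIG, "8192");
    // How long to wait for a batch to fill before sending it anyway
    props.put(ProducerConfig.LINGER_MS_CONFIG, "100");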

Please check the following question about a similar issue; it may help:

Kafka producer TimeoutException: Expiring 1 record(s)

You can also check when-does-the-apache-kafka-client-throw-a-batch-expired-exception/34794261#34794261 for more details about the batch-expired exception.

Also implement a proper retry policy (a sketch follows below).
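
For example (a minimal sketch against the question's `props`; the values are illustrative):

    // Retry transient send failures a few times with a short backoff
    props.put(ProducerConfig.RETRIES_CONFIG, "5");
    props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, "500");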

Note that this does not account for network issues on the sender side; with network issues you will not be able to send to the Event Hub at all.

Hope it helps.

Mohit Verma
  • Still, I can see the same exception: Expiring 7 record(s) for eventhubname: 60125 ms has passed since batch creation plus linger time – Prakash_se7en Sep 24 '19 at 14:47