
I need to send some data to eventhub, but it is not going through because the size is too big. Is there a way to compress the data or some command that chunks up the data and joins it in event hub? I am using the following java-code to send:

EventData sendEvent = new EventData(payloadBytes);
EventHubClient ehClient = EventHubClient.createFromConnectionStringSync(connStr.toString());
ehClient.sendSync(sendEvent);

What are my options if the payloadBytes is too big?

Sébastien Dawans
Aparna
  • How do you send the data? Do you use SendBatchAsync? Or is a single event already too big? It does not make sense to use the Event Hub to send lots of big messages. – Peter Bons Jul 06 '16 at 14:06
  • I am using EventData sendEvent = new EventData(payloadBytes); EventHubClient ehClient = EventHubClient.createFromConnectionStringSync(connStr.toString()); ehClient.sendSync(sendEvent); Is there a method SendBatchAsync which I can use to send the message in batches? – Aparna Jul 07 '16 at 14:52
  • 3
    Yes there is, see https://msdn.microsoft.com/en-us/library/microsoft.servicebus.messaging.eventhubclient.sendbatchasync.aspx but it probably won't help you since a single payloadBytes is already larger than 256 KB. You will have to reduce the payload or split it over multiple events (which then could be send using SendBatchAsync to improve performance). There is no method to do that (compression or splitting) for you, you will have to write it yourself. – Peter Bons Jul 07 '16 at 15:01
  • How large is the data? Is it very large (>100 MB) or just a little over 256 KB (<1 MB)? @Peter: the max limit on the aggregate size of all messages in a batch via the SendBatchAsync API is 256 KB. The primary advantage of the SendBatch API is to achieve transactional semantics and ordering within that batch. – Sreeram Garlapati Jul 22 '16 at 16:51

3 Answers


Another approach would be to switch from the basic to the standard tier, which allows you to send events of up to 1 MB in size. Microsoft docs: Quotas and limits - basic vs standard tiers.

lily_m

You can try to compress the message you send. According to the docs, you can edit the properties map inside the EventData:

var eventData = new EventData(....);
eventData.getProperties().put("Compression","GZip");

However, their solution didn't work for me and the message size didn't decrease. Therefore I gzipped the data myself before adding it to the batch:

private EventData compressData(Data data) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    try (GZIPOutputStream gos = new GZIPOutputStream(baos)) {
        // Better to create the serializer once, outside this method's scope
        gos.write(new DefaultJsonSerializer().serializeToBytes(data));
    }
    return new EventData(baos.toByteArray());
}
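On the receiving side you need the matching decompression step. Here is a minimal, self-contained sketch of the round trip using only java.util.zip (the class name GzipRoundTrip and the sample payload are mine, for illustration; in practice the decompressed bytes would come from the EventData body):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRoundTrip {
    // Gzip raw bytes before wrapping them in an EventData.
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gos = new GZIPOutputStream(baos)) {
            gos.write(raw);
        }
        return baos.toByteArray();
    }

    // Reverse the step at the consumer: gunzip the event body.
    static byte[] decompress(byte[] gzipped) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = gis.read(buf)) > 0) {
                baos.write(buf, 0, n);
            }
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "{\"deviceId\":\"sensor-1\",\"temp\":21.5}"
                .getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8)
                .equals(new String(original, StandardCharsets.UTF_8))); // prints "true"
    }
}
```

Note that gzip only helps for compressible (e.g. JSON/text) payloads; already-compressed binary data will not shrink much.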
JeyJ

Could you describe your use case and what sort of data you are going to publish? That would help find the right solution. Here are some common options I have used in my projects.

Some concerns to keep in mind when designing a system that needs a message broker or event streaming:

  • Event Hub is designed for event streaming, as described here. It scales well when you have a huge number of events.

  • The 256 KB limit per message is far more than is usually needed to transfer events, which are typically text based.

In my project I had two different use cases where 256 KB was not enough; here is how I solved them:

  1. We needed to publish internal events of our monolithic system so they could later be consumed by our microservices. Most of the time the messages were small, but sometimes they grew slightly bigger than 256 KB, and the fix was easy: we compress them before publishing and unzip them at the receiver. You can find a good sample here.

  2. In the second scenario the messages were far bigger than 256 KB, and compression was not enough. What I did was first create a blob with that content and then publish the event with a reference to that blob. The event receiver can then get the content from the blob, no matter how big it is.
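The second option is often called the "claim check" pattern. A minimal sketch of the idea, where the BlobStore interface is a hypothetical stand-in for whatever storage client you use (e.g. the Azure Blob Storage SDK), and only a small reference event is sent to Event Hub:

```java
import java.util.UUID;

public class ClaimCheck {
    // Hypothetical storage abstraction; in practice this would wrap
    // a real client such as Azure Blob Storage.
    interface BlobStore {
        String put(String name, byte[] content); // returns a reference/URI
    }

    // Store the large payload externally and build the small event body
    // that actually goes to Event Hub.
    static String buildReferenceEvent(BlobStore store, byte[] largePayload) {
        String blobRef = store.put(UUID.randomUUID().toString(), largePayload);
        // The event stays tiny: just a pointer the consumer can dereference.
        return "{\"blobRef\":\"" + blobRef + "\"}";
    }

    public static void main(String[] args) {
        // In-memory stand-in for a blob store, for illustration only.
        BlobStore inMemory = (name, content) -> "https://example.blob/" + name;
        byte[] big = new byte[1024 * 1024]; // pretend this exceeds 256 KB
        String event = buildReferenceEvent(inMemory, big);
        System.out.println(event.length() < 256 * 1024); // prints "true"
    }
}
```

The consumer reads blobRef from the event and fetches the content from storage; you also need a policy for deleting blobs once they are consumed, or a lifecycle rule that expires them.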

This should cover most use cases, but let me know more about your problem if it doesn't help.

  • Thank you for your answer. My use case is collecting telemetry data from devices. The zip did not work because I had a Stream Analytics query after the Event Hub, going on to Power BI. Maybe Event Hub is not the right tool... I am trying IoT Hub instead. – Aparna Jul 20 '17 at 11:24
  • IOThub uses an eventhub under the hood – Peter Bons Jul 20 '17 at 11:30
  • @Bahman Fakhr Sabahi, thank you for sharing your thoughts! Is there any example that can be referred to for your second approach? Do we need to compress the blob as well (e.g. convert JSON into Parquet)? Did you store the blob in blob storage, or would it be a temp store? Will your functions remove the blob once it's done? – Drex Aug 09 '19 at 15:26