
I am building a real-time energy monitoring system where the data comes from sensors, with new readings arriving every second. The data will be aggregated and rendered as charts. I have looked into real-time stream processing for large amounts of data, and that led me to Apache Kafka.

Right now my web app uses Express.js with the kafka-node library. Currently I insert new data manually through the command line as a producer. In my server code, I have set up a consumer that listens to Topic1.

Server code:

var express = require('express');
var app = express();
var http = require('http').Server(app);
var bodyParser = require('body-parser');
var urlencodedParser = bodyParser.urlencoded({ extended: false });

var server = app.listen(3001, ()=>{
  console.log("app started on port 3001");
});

var io = require('socket.io').listen(server);


var kafka = require('kafka-node');

// Note: newer releases of kafka-node replace kafka.Client with kafka.KafkaClient
var Consumer = kafka.Consumer,
    client = new kafka.Client(),
    consumer = new Consumer(client,
      [
        { topic: 'Topic1', partition: 0 }
      ],
      {
        autoCommit: false
      }
    );

app.use(express.static('public'));

consumer.on('message', (message) => {
  console.log(message.value);
  callSockets(io, message.value);
});

function callSockets(io, message){
  io.sockets.emit('update', message);
}
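Since every Kafka message currently triggers an emit, the browser receives one socket event per sensor reading. A hedged sketch of a per-interval buffer that could sit between the consumer and callSockets, so clients get one aggregated chart point per second instead of the raw firehose (IntervalAggregator and its field names are my own invention, not part of kafka-node):

```javascript
// Buffer raw readings and flush one aggregate per interval, so the browser
// receives a single chart point per flush instead of every Kafka message.
class IntervalAggregator {
  constructor(flushMs, onFlush) {
    this.readings = [];
    this.onFlush = onFlush;
    this.timer = setInterval(() => this.flush(), flushMs);
  }

  // Record one raw sensor value; it will be folded into the next flush.
  add(value) {
    this.readings.push(value);
  }

  // Compute count/min/max/avg over the buffered readings, clear the buffer,
  // and hand the aggregate to the onFlush callback.
  flush() {
    if (this.readings.length === 0) return;
    const values = this.readings;
    this.readings = [];
    const sum = values.reduce((a, b) => a + b, 0);
    this.onFlush({
      count: values.length,
      min: Math.min(...values),
      max: Math.max(...values),
      avg: sum / values.length
    });
  }

  stop() {
    clearInterval(this.timer);
  }
}
```

Wired into the existing code, the consumer callback would become something like `consumer.on('message', (m) => agg.add(parseFloat(m.value)))` with `const agg = new IntervalAggregator(1000, (stats) => callSockets(io, stats))`, assuming the message value parses to a number.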

Client code:

<script type="text/javascript">
  var socket = io('http://localhost:3001');

  socket.on('connect', ()=>{
    console.log("connected");
  })

  socket.on('update', (data) =>{
    console.log(data);
  })
</script>
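On the client, each 'update' event then needs to land in the chart's data series. A small framework-agnostic sketch of a rolling window of points, so the chart shows a sliding view rather than growing without bound (makeSeries is an illustrative name; any charting library could consume the array it returns):

```javascript
// Keep at most maxPoints readings; older points fall off the front,
// giving the chart a fixed-width sliding window.
function makeSeries(maxPoints) {
  const points = [];
  return {
    // Append a point and trim the window; returns the current array
    // so it can be handed straight to a chart's render call.
    push(timestamp, value) {
      points.push({ timestamp, value });
      if (points.length > maxPoints) points.shift();
      return points;
    },
    data() {
      return points;
    }
  };
}
```

Inside the socket handler this might look like `socket.on('update', (data) => chart.render(series.push(Date.now(), parseFloat(data))))`, where `chart.render` stands in for whatever your charting library uses.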

I am using socket.io to emit the messages consumed from Kafka. Is there another way to send Kafka data to the client side? Using socket.io does not seem very elegant to me here. Have I approached this the right way? Any recommendations are welcome!

Thank you.

OneCricketeer
Adis Azhar
  • You could use Kafka Connect to send topic data into a database, then expose an aggregation REST API on that... This way, you are not going to be joining/filtering/aggregating data on the client-side. – OneCricketeer Aug 06 '18 at 02:36
  • @cricket_007 my concern is how to send data from Kafka to the web view (client) in real time. Would using a REST API be real time? Wouldn't that mean executing an HTTP request, which the client has to trigger manually to get new data? – Adis Azhar Aug 06 '18 at 04:29
  • if you look at services such as DataDog, or software such as Grafana or Kibana, those pull at a minimum every 5 seconds via HTTP (I believe), and can perform min, max, sum, and avg functions over those larger time windows. I don't think your socket.io client will be able to handle a rapid influx of thousands of Kafka messages at once; you need some buffer, whether that is Redis or some other database. That is my main point. As for web-application tools, you can look at [Server-sent Events (SSE)](https://hackernoon.com/supercharging-kafka-enable-realtime-web-streaming-by-adding-pushpin-fd62a9809d94) – OneCricketeer Aug 06 '18 at 07:11
  • @cricket_007 what will the buffer do in my scenario? Sorry I find it hard to imagine without some explanations. – Adis Azhar Aug 06 '18 at 20:16
  • Well, I don't know how many messages you are sending, so maybe it isn't necessary. But I could imagine a scenario in which you are receiving messages faster than you are able to process them. In which case, you dump the raw messages out to a durable store rather than trying to plot them immediately. – OneCricketeer Aug 06 '18 at 20:18
  • Digging into the same thing. My idea is: once I land my data from Kafka into a MySQL materialized view (CQRS), on success I emit a new Kafka event. Then I can have a new Kafka consumer (Golang) which waits for those specific events (MySQL inserted successfully), and once my WebSocket service receives such an event, I can make a new request to MySQL for the specific record and broadcast it over web sockets. I am a noob in programming and not sure whether it will work, but it seems legit to me. Will try this today/tomorrow and will update this thread. – Dzintars Dec 22 '18 at 16:48
  • @Dzintars how did you go? – Adis Azhar Dec 23 '18 at 14:54
  • @adis Got it working, but I did it in Golang; this topic was about NodeJS. Basically I made a new API endpoint, upgraded connections to WSS in the handler, and spun up a new Kafka consumer. It decodes the Kafka message from binary to JSON and writes it into the connection. At least it does what I want. – Dzintars Jan 14 '19 at 14:32
  • @adis Forgot to mention that I am using channels to send messages from the consumer to the handler. – Dzintars Jan 15 '19 at 19:00
  • Hi, is there any solution on this topic? – Humesh Sindpure Apr 20 '20 at 12:06
  • @HumeshSindpure are you trying to do the same thing? My code above works okay but doesn't do well for large streams of data, e.g. sensor readings. – Adis Azhar Apr 20 '20 at 12:35
  • @AdisAzhar do you have any other solution that you feel is more elegant/better performing than the above? **My requirement is exactly the same as yours.** – jAntoni Aug 27 '20 at 18:55
  • @jAntoni what performance hit are you experiencing? If you have data streams, I suggest using a buffer between Kafka and the client app. Redis should do the trick... Or you could try finding a Kafka-to-websocket connector; it may help. – Adis Azhar Aug 28 '20 at 10:27
  • @AdisAzhar I have just started building the app and was considering which way to go about it. I have now implemented it exactly as you have shown here, and I am successfully getting data from Kafka into the page console. Node.js is entirely new to me. Can you please help me get the data onto the page itself, rather than just the console? Or shall I create a new question here for this? Kindly help me. – jAntoni Sep 08 '20 at 09:34

1 Answer


Without speaking to your solution specifically, I think you're doing the right thing by creating a dedicated API for the client to read data from. The client needs to be able to receive updates from somewhere, and the only "faster" way would be to let the client pull directly from Kafka, which would substantially increase risk, since Kafka is not designed to be publicly accessible.

So your solution with node.js is actually rather elegant compared to doing the same thing in C# or Java which would require a lot more code.

Richard Fuller