
I am currently choosing between Kafka Streams and Logstash for real-time log collection, transformation and enrichment, with the results finally sent to Elasticsearch. The logs come from different IT network devices such as firewalls, switches, access points, etc.

Since Kafka Streams and Logstash have similar functionality, are there benefits to choosing one over the other? (Performance? Ease of deployment?)

Thanks

Ong Yu Feng
  • It depends on your use case. Check [this](https://stackoverflow.com/questions/40864312/how-logstash-is-different-than-kafka) out. – el-chico Feb 05 '21 at 03:26
  • Just to be sure that I get your question right: your logs are still located on the devices, right? I would guess Kafka Streams is the wrong thing to pick: it is a client library that you use in your own application to perform stream operations. It includes no logic to pick up log files from somewhere; that is something you would have to build. Maybe Logstash already has this built in, but as far as I know it has to run on the device where the log files are (just guessing here). – TobiSH Feb 05 '21 at 06:27

1 Answer


Kafka Streams and Logstash are two completely different things.

Kafka Streams is a client library that you use to write an application that streams and processes data stored in Kafka brokers. You need to write your own application in Java.

Logstash is an ETL tool that you can use to extract/receive data from multiple sources, process this data using a wide range of filters and send it to different outputs, like elasticsearch, file, s3, kafka and many others.

It is very common to use Logstash and Kafka together, with Kafka working as a message queue for the messages that Logstash will consume and process. Shippers like Filebeat send data to the Kafka brokers, and then you use Logstash to consume this data.
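As a rough illustration of that setup, a Logstash pipeline that consumes from Kafka and writes to Elasticsearch could look something like the sketch below. The broker address, topic name, consumer group, index name and Elasticsearch URL are all placeholders I made up, and the grok pattern assumes plain syslog-style lines, which your devices may not produce, so treat it as a starting point rather than a working config for your environment.

```
input {
  kafka {
    bootstrap_servers => "localhost:9092"   # placeholder broker address
    topics => ["network-logs"]              # hypothetical topic that Filebeat writes to
    group_id => "logstash"                  # consumer group shared by Logstash instances
    codec => "json"                         # Filebeat publishes events as JSON
  }
}

filter {
  # Assumes syslog-style lines; real firewall/switch formats will need their own patterns.
  grok {
    match => { "message" => "%{SYSLOGLINE}" }
  }
  # Example enrichment: tag every event with its (assumed) origin.
  mutate {
    add_field => { "source_type" => "network-device" }
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]      # placeholder Elasticsearch URL
    index => "network-logs-%{+YYYY.MM.dd}"  # hypothetical daily index name
  }
}
```

Running several Logstash instances with the same `group_id` lets Kafka spread the topic's partitions across them, which is one of the usual reasons for putting Kafka in front of Logstash.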

You can build your own applications in Java using the Kafka Streams library to collect, process and ship the data to Elasticsearch, but this will be very complex compared with using the tools of the stack: Filebeat to collect logs, Logstash to receive and process them, and Elasticsearch to store them.

leandrojmp