5

I am planning to set up Apache NiFi on Kubernetes and take it to production. While searching, I didn't find anyone who appears to be using this combination in a production setup.

Is it a good idea to choose this combination? Could you please share your thoughts or experience with it?

https://community.cloudera.com/t5/Support-Questions/NiFi-on-Kubernetes/td-p/203864

user2858005
  • There are even Helm charts for Apache NiFi, although there is no official one yet. You may consider using them: https://github.com/AlexsJones/nifi ; https://github.com/cetic/helm-nifi – mario Oct 07 '19 at 13:40
  • Thank you @mario! I finally ended up on the same page as well. Let me go through helm-nifi. – user2858005 Oct 07 '19 at 13:57

3 Answers

1

As mentioned in the comments, work has been done on NiFi on Kubernetes, but it is not yet generally available.

It is good to know that there will be dataflow offerings where NiFi and Kubernetes meet in some shape or form during the coming year.* So I would recommend keeping an eye out for this and discussing it with your local contacts before trying to build it from scratch.

*Disclaimer: Though I am an employee of Cloudera, the main driving force behind NiFi, I am not qualified to make promises and this is purely my own view.

Dennis Jaheruddin
    Thank you @Dennis for your comments and opinion. We are planning to go to production within one or two months and to add more of NiFi's features to this project, so I am a bit hesitant to try Kubernetes at this moment. Perhaps I can go with a Docker container installation for now and move to Kubernetes once it is officially released. I hope a NiFi Docker container on AWS will work well for a production setup. – user2858005 Oct 07 '19 at 19:21
1

I would like to invite you to try a Helm chart at https://github.com/Datenworks/apache-nifi-helm

We've been maintaining a 5-node NiFi cluster on GKE (Google Kubernetes Engine) in a production environment without major issues, and it performs quite well. Please let me know if you find any issues running this chart in your environment.

accbel
0

For any high-volume setup on k8s, be sure to tune your Linux kernel, primarily the settings related to the Linux connection tracker (conntrack). You should also expect to see non-zero TCP timeouts, retries, out-of-window ACKs, et al. Depending on which container networking implementation is used, additional configuration changes may be required.
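As a rough way to see whether the connection tracking table is a bottleneck on a node, the sketch below compares the current number of tracked connections with the configured maximum. It assumes a Linux node with the `nf_conntrack` module loaded, where the kernel exposes `nf_conntrack_count` and `nf_conntrack_max` under `/proc/sys/net/netfilter`; the function name and the idea of checking a fill ratio are illustrative, not something prescribed by NiFi or Kubernetes.

```python
from pathlib import Path

# Standard procfs location when the nf_conntrack module is loaded (assumption:
# the script runs on the node itself, or in a pod that can read the host's procfs).
CONNTRACK_DIR = Path("/proc/sys/net/netfilter")


def conntrack_usage(base=CONNTRACK_DIR):
    """Return the fraction of the conntrack table that is currently in use."""
    count = int((Path(base) / "nf_conntrack_count").read_text().strip())
    limit = int((Path(base) / "nf_conntrack_max").read_text().strip())
    return count / limit
```

Calling `conntrack_usage()` on a busy node and seeing a value close to 1.0 would suggest that `nf_conntrack_max` is worth raising; what counts as "close" depends on your traffic pattern.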

I will assume you are using containerd and NOT Docker networking (except, obviously, for the containers within a pod).

The issue applies to ANY heavy-IO pod: Kafka, NiFi, MySQL, PostgreSQL, you name it.

The incidence increases when:

  • "high" volumes of cross-pod (especially cross-node) TCP connections occur
  • messages are large (a megabyte or larger), which produces additional errors
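To check whether these conditions are actually producing retransmissions, one option is to read the kernel's TCP counters. The sketch below parses the two-line-per-protocol layout used by `/proc/net/snmp` (and `/proc/net/netstat`) and computes the share of sent segments that were retransmitted. The helper names are hypothetical, and the ratio is only one possible indicator.

```python
def parse_proc_net(text):
    """Parse /proc/net/snmp or /proc/net/netstat style text into nested dicts.

    Each protocol appears as a pair of lines: a header line with field names
    and a line with values, both starting with the same "Proto:" prefix.
    """
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    stats = {}
    for header, values in zip(lines[0::2], lines[1::2]):
        proto = header[0].rstrip(":")
        stats[proto] = dict(zip(header[1:], map(int, values[1:])))
    return stats


def retransmit_ratio(snmp_text):
    """Return RetransSegs / OutSegs from the Tcp section, or 0.0 if nothing was sent."""
    tcp = parse_proc_net(snmp_text)["Tcp"]
    return tcp["RetransSegs"] / tcp["OutSegs"] if tcp["OutSegs"] else 0.0
```

On a node, this could be used as `retransmit_ratio(open("/proc/net/snmp").read())`. The counters are cumulative since boot, so sampling twice and comparing the difference gives a better picture of the current rate.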

Be aware of any other components that use either the pod or VM TCP stack (e.g. PVC software supporting NiFi persisted storage).