
I am working in a distributed environment. I have a central machine that needs to monitor some 100 machines, so I need to use the ELK stack and keep monitoring the data.

Since Elasticsearch, Logstash, Kibana, and Filebeat are independent pieces of software, I want to know where I should ideally place them in my distributed environment.

My approach was to keep Kibana and Elasticsearch on the central node and to keep Logstash and Filebeat on the individual nodes.

Logstash will send data to the central node's Elasticsearch, which Kibana then displays.

Please let me know if this design is right.

vinod hy

1 Answer


Your design is not that bad, but if you install Elasticsearch on only one server, over time you will face availability problems.

You can do this:

  1. Install Filebeat and Logstash on all the nodes.
  2. Install Elasticsearch as a cluster. That way, if one Elasticsearch node goes down, another node can easily take over (see the elasticsearch.yml sketch after this list).
  3. Install Kibana on the central node.
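
For step 2, a minimal elasticsearch.yml sketch for one node of a small cluster could look like the following. This assumes Elasticsearch 5.x with Zen discovery; the cluster name, node names, and host names are placeholders, and the same file would be repeated on each node with its own node.name:

# elasticsearch.yml (sketch) on one node of a three-node cluster
cluster.name: my-es-cluster
node.name: node-1
node.master: true
node.data: true
network.host: 0.0.0.0
# list all cluster nodes so they can discover each other
discovery.zen.ping.unicast.hosts: ["es-node-1", "es-node-2", "es-node-3"]
# quorum of master-eligible nodes: (3 / 2) + 1 = 2, which avoids split brain
discovery.zen.minimum_master_nodes: 2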

NB:

  • Make sure you configure Filebeat to point to more than one Logstash server. That way, if one Logstash instance fails, Filebeat can still ship logs to another server (see the filebeat.yml sketch after this list).
  • Also make sure your Logstash configuration points to all the data nodes of your Elasticsearch cluster.
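
For the first point, a minimal filebeat.yml sketch could look like this; the host names are placeholders and it assumes the Logstash Beats input is listening on its usual port 5044:

# filebeat.yml (sketch): point Filebeat at more than one Logstash server
output.logstash:
  hosts: ["logstash-node-1:5044", "logstash-node-2:5044"]
  # spread events across the listed hosts; if one Logstash instance
  # becomes unreachable, Filebeat keeps shipping to the remaining ones
  loadbalance: true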

You can also go further by installing Kibana on, say, 3 nodes and putting a load balancer in front of them. That way the load balancer will route requests to a healthy Kibana instance.
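
As an illustration, a minimal sketch of such a load balancer, assuming nginx in front of three Kibana instances on their default port 5601 (host names are placeholders; this goes inside the http context of nginx.conf):

# nginx (sketch): round-robin across the Kibana instances
upstream kibana {
    server kibana-node-1:5601;
    server kibana-node-2:5601;
    server kibana-node-3:5601;
}

server {
    listen 80;
    location / {
        proxy_pass http://kibana;
    }
}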

UPDATE

With Elasticsearch configured, you can configure the Logstash output as follows:

output {
    elasticsearch {
        # list every Elasticsearch node so Logstash can fail over if one goes down
        hosts => ["http://123.456.789.1:9200","http://123.456.789.2:9200"]
        # name of the index the events are written to
        index => "indexname"
    }
}

You don't need to add `stdout { codec => rubydebug }` in your configuration.

Hope this helps.

berrytchaks
  • Thanks for the info. I am exploring Elasticsearch clustering with 2 machines and I hope I have made the necessary changes :) Please review them. These are the elasticsearch.yml changes I have made on my machine. **My machine (123.456.789.1):** cluster.name: ControlElasticSearch node.name: Node1 node.master: true node.data: true network.host: 123.456.789.1 discovery.zen.ping.unicast.hosts: ["123.456.789.1", "123.456.789.2"] – vinod hy May 25 '17 at 06:55
  • **Other machine (123.456.789.2):** cluster.name: ControlElasticSearch node.name: Node2 node.master: false node.data: true network.host: ***.***.***.2 discovery.zen.ping.unicast.hosts: ["123.456.789.1", "123.456.789.2"] I was able to reach 123.456.789.1:9200 and 123.456.789.2:9200 from both machines. – vinod hy May 25 '17 at 06:55
  • I also tried the command curl -XGET "http://132.186.102.84:9200/_cluster/state?pretty" – vinod hy May 25 '17 at 06:56
  • I get the output below: "cluster_name" : "ControlElasticSearch", "version" : 20, "state_uuid" : "6_NZDrKkTMKbCR2Vh4r_9Q", "master_node" : "7XxnVX8nRa-QLBBxmN7AnQ", "blocks" : { }, "nodes" : { "7XxnVX8nRa-QLBBxmN7AnQ" : { "name" : "Node1", "ephemeral_id" : "Wte2C-KUQVWLKG5ba_wHBA", "transport_address" : "123.456.789.1:9300", "attributes" : { } }, "LKhUGAEXTzO-ZcglRYC_yQ" : { "name" : "Node2", "ephemeral_id" : "ChRBXvl8Tri8YxffDRgxyA", "transport_address" : "123.456.789.2:9300", "attributes" : { } } }, – vinod hy May 25 '17 at 06:56
  • Please let me know if everything is fine so far. Now I want to know how to send data from my Logstash to Elasticsearch. My previous configuration was: output { elasticsearch { hosts => ["localhost:9200"] } stdout { codec => rubydebug } } – vinod hy May 25 '17 at 06:56
  • Yes, your Elasticsearch configuration is good. Please see the update for the Logstash output configuration. – berrytchaks May 25 '17 at 10:14
  • Hi, I went through the link below: https://www.elastic.co/guide/en/elasticsearch/guide/current/_an_empty_cluster.html. The last paragraph says that, as a user, we can talk to any node in the cluster; every node knows where each document lives and can forward our request directly to the nodes that hold the data we are interested in. – vinod hy May 25 '17 at 10:43
  • So from my Logstash I can forward to either 123.456.789.1:9200 or 123.456.789.2:9200, as below: output { elasticsearch { hosts => ["123.456.789.1:9200"] } stdout { codec => rubydebug } } or output { elasticsearch { hosts => ["123.456.789.2:9200"] } stdout { codec => rubydebug } } It is not necessary to add both IPs in hosts. – vinod hy May 25 '17 at 10:43
  • OK, suppose you choose to use `output { elasticsearch { hosts => ["123.456.789.1:9200"] } stdout { codec => rubydebug } }`. If for one reason or another the server `123.456.789.1` goes down, events won't reach Elasticsearch any more. How will you address that? – berrytchaks May 25 '17 at 11:02
  • Hmm, you are right, that makes sense. But I have two doubts: 1. Suppose both are online, will my data be sent to both nodes? 2. What Elasticsearch URL should I give in Kibana? – vinod hy May 25 '17 at 12:37
  • `1. Suppose both are online, will my data be sent to both nodes?` No, your data will be sent to the node that is healthy, say node1, and as you have installed Elasticsearch as a cluster, the other node will know that data has been sent to node1. `2. What Elasticsearch URL should I give in Kibana?` Oh! I completely forgot that part. As a workaround, you can just talk to the master of your cluster. For availability, you will have to proxy Elasticsearch. See https://github.com/elastic/kibana/issues/2260 and https://discuss.elastic.co/t/configure-kibana-for-multiple-es-servers-nodes/2431/2. – berrytchaks May 25 '17 at 13:11
  • I have a few more doubts. 1. As per the link you shared, the cluster should have at least 3 master-eligible nodes, but in the example I shared above I made 1 master-eligible node and 1 dedicated data node. Is that OK? 2. What will happen if there is no data node at all? Say, in the 2-node cluster scenario I explained, I make one node node.master = true and node.data = false, and the other node node.master = false and node.data = false. – vinod hy May 25 '17 at 17:21
  • One point I would like to confirm here: in the initial description, I mentioned that in my distributed environment there will be 1 central node monitoring 100 machines. When you suggested an Elasticsearch cluster, I assumed I would have new nodes dedicated to the cluster, apart from the 1 central node and the 100 nodes I mentioned. I assume this because I am not sure how the 100 machines will grow; it could be only 50 machines or it could grow up to 500 machines. – vinod hy May 26 '17 at 11:25
  • I was thinking of forming the cluster from the 100 machines being monitored by the 1 central node, which is a bad idea. Please correct me if I am wrong. **Instead of that, I will have separate nodes dedicated to the Elasticsearch cluster.** – vinod hy May 26 '17 at 11:26
  • Yes, that is exactly what I was saying. – berrytchaks May 26 '17 at 12:02
  • I want to set up a cluster with 3 nodes, which is the basic minimum count required to form a cluster. I have the nodes configured as follows: **node1:** master:true data:true **node2:** master:true data:true **node3:** master:true data:true – vinod hy May 27 '17 at 10:41
  • **[My Question 1]** I have set the minimum master nodes value to 2 as per the calculation 3/2 + 1 = 2. With this we avoid the split-brain issue. I have also satisfied the rule of thumb of having a minimum of 3 master-eligible nodes here. Am I right? **[Your answer]** **[My Question 2]** So now the data is distributed to all the nodes, since all are master-eligible data nodes. **[Your answer]** – vinod hy May 27 '17 at 10:41
  • **[My Question 3]** The rule of thumb of having a minimum of 3 master-eligible nodes is mandatory because, if you have only 2 nodes in your cluster: **issue 1:** if the minimum master nodes value is set to 1, then in case of a network connectivity issue between the nodes, both nodes will declare themselves master, which is the serious split-brain problem. **issue 2:** if the minimum master nodes value is set to 2, neither can become master and the cluster becomes inactive due to non-availability of a master. – vinod hy May 27 '17 at 10:44
  • These are the 2 issues I found that mandate having at least 3 master-eligible nodes per cluster. The conclusion is that there can be no cluster with 2 nodes; at least 3 nodes are mandatory to form a cluster. **[Your answer]** – vinod hy May 27 '17 at 10:44
  • **[My Question 4]** Extending point 1: say in my cluster of 3 nodes I have only 2 master-eligible nodes and the other one is a client node. **node1:** master:true data:true **node2:** master:true data:true **node3:** master:false data:false – vinod hy May 27 '17 at 10:45
  • The minimum master nodes count here is 2. **Issue 1:** say node 1 or node 2 goes off the network; the cluster will become inactive as master election will not be possible. But if node 3 goes off the network, there is no issue: node 3 cannot become master, and between node1 & node2 one will be chosen as master through the election process, so the cluster remains healthy. Am I right? **[Your answer]** – vinod hy May 27 '17 at 10:45
  • **[My Question 5]** I want to draw a conclusion here: with only 3 nodes, I cannot have dedicated master and dedicated data nodes; all three have to be master-eligible data nodes. Is that statement right? **[Your answer]** – vinod hy May 27 '17 at 10:45
  • Can you please reply to my questions, @berrytchaks? – vinod hy May 29 '17 at 07:35
  • Sorry, I'm not feeling well these days. Please give me some time to think about your questions and get back to you. – berrytchaks May 29 '17 at 14:30
  • Hi berrytchaks, I have decided not to go ahead with an Elasticsearch cluster. I will be running a single instance of Elasticsearch on the central machine. So the design I mentioned in the question will have Kibana and Elasticsearch running on the central machine, with Logstash and Filebeat running on the other nodes. My question is: if I also move Logstash to the central machine, only Filebeat will be running on the nodes, communicating with Logstash on the central machine. Any issues here? – vinod hy Jun 12 '17 at 09:00
  • I think if I move Logstash to the central machine, I won't be able to add node-specific filters; it will end up being a common filter for all the nodes. I think having Logstash running on the individual nodes along with Filebeat is the better idea, while Kibana and Elasticsearch continue running on the central machine. Am I right? – vinod hy Jun 12 '17 at 09:03
  • Hi Vinod. `My question is: if I also move Logstash to the central machine, only Filebeat will be running on the nodes, communicating with Logstash on the central machine. Any issues here?` No issue, that will work fine. – berrytchaks Jun 12 '17 at 09:28
  • But when we want to add a filter to Logstash, the filter will be common to all the nodes. If I have Logstash running independently, it can have node-specific filters. Having Logstash running on individual nodes gives us more freedom to write grok filters, right? – vinod hy Jun 12 '17 at 09:35
  • Answer to the second question: you can also run Logstash and Filebeat on the individual nodes. – berrytchaks Jun 12 '17 at 09:35
  • Now my next task is to establish an SSL connection between Logstash and Elasticsearch. I am using X-Pack for this. Can we achieve an SSL connection using X-Pack? Please give your inputs. – vinod hy Jun 12 '17 at 09:47
  • `But when we want to add a filter to Logstash, the filter will be common to all the nodes. If I have Logstash running independently, it can have node-specific filters. Having Logstash running on individual nodes gives us more freedom to write grok filters, right?` Yes. – berrytchaks Jun 12 '17 at 09:51
  • `Now my next task is to establish an SSL connection between Logstash and Elasticsearch. I am using X-Pack for this. Can we achieve an SSL connection using X-Pack?` Sorry, but I don't have much experience with X-Pack; for security, I'm using `searchguard.ssl`. – berrytchaks Jun 12 '17 at 09:54
  • OK. Did you do any analysis on why you chose Search Guard over X-Pack? I am new to this; I found X-Pack on the Elastic website, so I thought of using it. – vinod hy Jun 12 '17 at 09:57
  • Oh sorry, my bad. We went in for X-Pack, but it is really expensive; I think it is $1000 per year. You can check that [here](https://www.elastic.co/subscriptions). That is why we went in for `searchguard`, which is open source. – berrytchaks Jun 12 '17 at 10:19
  • Isn't X-Pack also open source? It is available as a plugin; I have installed it as a plugin in Elasticsearch. – vinod hy Jun 12 '17 at 10:42
  • Please read [this](https://www.elastic.co/guide/en/x-pack/current/license-management.html) – berrytchaks Jun 12 '17 at 10:56
  • Hi, I went through the link. It is for a 30-day trial only, so I don't want to go for it. I would also rather go for Search Guard. I will open a separate topic for this and post the link here; please give your inputs there. – vinod hy Jun 12 '17 at 18:23
  • OK, do that and let me know. – berrytchaks Jun 12 '17 at 18:52
  • Hi berrytchaks, please find the link below for the Search Guard discussion: https://stackoverflow.com/questions/44517107/configuring-searchguard-on-elk-for-security – vinod hy Jun 13 '17 at 09:14
  • Please help me with using Search Guard via the link I shared above. – vinod hy Jun 26 '17 at 09:10