2

I am running ElasticSearch 6.2.2 on an Ubuntu 16.04.3 VM in Azure. It had been up and running fine, but after I rebooted the machine a few days ago I could not get the ElasticSearch service to start at all. That issue was shared and solved here: (ElasticSearch Fails to Start on Ubuntu 16.04.3 - status=1 Failure) by increasing the heap size in the jvm.options file.

Now I have the ElasticSearch service running, but I cannot ping it at all. I have tried from both inside the VM (as localhost:9200) and from outside (the same way I successfully make calls to our other ES boxes), but I'm told "Could not get any response" (Postman's wording).

The part that is making this impossible to diagnose is nothing is getting written to the ElasticSearch logs! The last time anything was written to any log at /var/log/elasticsearch was before I rebooted the machine a couple days ago.

I have checked the settings in elasticsearch.yml, and everything seems to be in line with the elasticsearch.yml on a different box of ours, in a different location, which runs another ElasticSearch instance without any issue.

EDIT: per request - the elasticsearch.yml file from the box that is NOT working correctly is here: http://s000.tinyupload.com/index.php?file_id=72318548245343478927 For comparison purposes, the elasticsearch.yml file from the box that IS working correctly is here: http://s000.tinyupload.com/index.php?file_id=20127693354114612595 Please note that the one that IS working correctly has 3 nodes whereas the one that is not working has only one node, so there will be some slight differences between the yml files because of this.

Stpete111

3 Answers

1
  1. Check if path.logs: /var/log/elasticsearch is defined in elasticsearch.yml. Add this line if not present.
  2. Check whether the user has permission to write into /var/log/elasticsearch. If not, change the permissions on the directory and its files: sudo chmod 777 /var/log/elasticsearch/* and sudo chmod 777 /var/log/elasticsearch
  3. Open /etc/init.d/elasticsearch and check whether ES_PATH_CONF is defined as ES_PATH_CONF="/etc/elasticsearch"
  4. You may try commenting out the following lines in log4j2.properties under /etc/elasticsearch:
     logger.xpack_security_audit_logfile.name = org.elasticsearch.xpack.security.audit.logfile.LoggingAuditTrail
     logger.xpack_security_audit_logfile.level = info
     logger.xpack_security_audit_logfile.appenderRef.audit_rolling.ref = audit_rolling
     logger.xpack_security_audit_logfile.additivity = false
  5. Use netstat -nultp | grep 9200 and check whether anything is listening on the port.
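
As a cross-check on step 5, here is a minimal Python sketch (not part of the original answer) that tests whether anything accepts TCP connections on a given port. The helper name `is_port_open` and the `127.0.0.1` host are assumptions for illustration; 9200 is the port discussed in this question.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing is accepting connections on that address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Host is an assumption; 9200 is the HTTP port from the question.
    print("9200 open on 127.0.0.1:", is_port_open("127.0.0.1", 9200))
```

If this reports False for 127.0.0.1, it may be worth trying the VM's own IP as well, since the service could be bound to a different interface.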
ArnavRay
  • 1. Yes it is. 2. I ran both these commands. I ran them from root. Should I also run as the main user? 3. Yes it is. 4. These lines do not exist in the mentioned file. 5. I received no response when running this command. Please advise at your earliest convenience and thanks for your continued help! – Stpete111 Sep 19 '18 at 21:03
  • By the way, this box is configured to listen on 9200 via the Azure inbound net rules - so it should definitely be listening on 9200. – Stpete111 Sep 19 '18 at 21:08
  • 1
    Inbound rules are for the firewall. If netstat gave no response, it means the ES service has not yet started listening on 9200. If you have a large amount of data it might take a long time to start, even 4 hrs. If sudo service elasticsearch status returns active, let it run for a few hours. Keep checking the netstat command to see if it has bound to port 9200 – ArnavRay Sep 19 '18 at 21:12
  • 1
    Can you please share your elasticsearch.yml file in the meantime. – ArnavRay Sep 19 '18 at 21:13
  • This ES instance has no data in it. There are a couple indexes but no data in them yet. I have attached the yml files in the original post. – Stpete111 Sep 19 '18 at 21:20
  • ES definitely is NOT bound to port 9200. I keep running that command, and nothing. This surely is the problem. How do I fix this? – Stpete111 Sep 19 '18 at 22:59
  • 1
    @Stpete111 The yml file links have expired. Shows 404. – ArnavRay Sep 20 '18 at 04:13
  • sorry about that, new links posted. – Stpete111 Sep 20 '18 at 11:51
  • Still no logs being written for the last 2 days. This is so very odd. – Stpete111 Sep 20 '18 at 12:11
  • 1
    This is an extremely weird problem. Can you put elasticsearch.yml to its default settings and try starting the service. – ArnavRay Sep 20 '18 at 17:00
  • The issue is resolved. Arnav I can't thank you enough for your help and attention. – Stpete111 Sep 20 '18 at 21:52
  • Arnav, can we continue a discussion about ElasticSearch in a chat? I have a question for you. – Stpete111 Sep 20 '18 at 22:55
  • Hi @Stpete111, sorry for the late reply. We can continue the discussion on chat. Let me know when you are available. Great to hear this problem is finally solved. – ArnavRay Sep 21 '18 at 13:47
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/180540/discussion-between-stpete111-and-arnavray). – Stpete111 Sep 21 '18 at 17:34
1

The issue was with a line in the elasticsearch.yml file which showed as

"10.5.11.6""

That extra quotation mark at the end is what was causing the entire problem.

For anyone this may help: the elasticsearch.yml file is extremely sensitive when it comes to spacing, punctuation and case. Even an extra space somewhere can cause the entire service to fail. Be very diligent with your edits to elasticsearch.yml.
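
A rough way to catch this kind of typo before restarting the service is sketched below in Python. It is only a heuristic that flags lines with an odd number of double quotes, not a YAML parser; the function name `find_unbalanced_quotes` and the key `example.host` are hypothetical, chosen for illustration (the actual key on the offending line is not shown above).

```python
from typing import List, Tuple


def find_unbalanced_quotes(text: str) -> List[Tuple[int, str]]:
    """Return (line number, stripped line) for lines with an odd count of '"'."""
    suspects = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and whole-line comments.
        if not stripped or stripped.startswith("#"):
            continue
        # An odd number of double quotes means one is unmatched.
        if stripped.count('"') % 2 == 1:
            suspects.append((lineno, stripped))
    return suspects


if __name__ == "__main__":
    # Hypothetical sample; 'example.host' is a placeholder key.
    sample = 'cluster.name: demo\nexample.host: "10.5.11.6""\n'
    for lineno, line in find_unbalanced_quotes(sample):
        print(f"line {lineno}: {line}")
```

Escaped quotes inside a string would trip this check, so treat a hit as a line to inspect by eye rather than a definite error.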

Stpete111
0

There are a few ways to debug this:

1. Check if you have ES service running on that particular host via `ps -ef | grep elastic`
2. Check which port ES is listening on (if any) via netstat.
3. It might be the case that ES is running but is bound to the instance IP rather than to localhost. The elasticsearch.yaml file should give you a hint.
4. Make sure your /usr/share/elasticsearch/elasticsearch.yaml is the file that is being picked up and not the default at /etc/elasticsearch.yaml
5. Configure the logging location in elasticsearch.yaml.
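
Since the output of `ps -ef` in step 1 can be long, here is a small Python sketch that keeps only the lines mentioning "elastic" and drops the grep process itself. The function name `elastic_processes` is hypothetical, and the sketch assumes a Linux host where `ps` is available.

```python
import subprocess
from typing import List


def elastic_processes(ps_output: str) -> List[str]:
    """Return the ps lines that mention 'elastic', excluding any grep process."""
    matches = []
    for line in ps_output.splitlines():
        lowered = line.lower()
        if "elastic" in lowered and "grep" not in lowered:
            matches.append(line)
    return matches


if __name__ == "__main__":
    try:
        result = subprocess.run(
            ["ps", "-ef"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        print("ps is not available on this system")
    else:
        found = elastic_processes(result.stdout)
        # An empty list suggests no ES process is running at all.
        print("\n".join(found) if found else "no elastic process found")
```

If a java process with elasticsearch in its arguments shows up, the service is running and the next thing to look at is which address and port it is bound to.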

Hope this helps.

Nishant Singh
  • Thanks Nishant - I'm still very new to this so I'll need some more detailed guidance: 1. That command gives me a LOT of information, what am I looking for in there? 2. Running netstat -a I see nothing about ElasticSearch in any of the LISTENING rows. Problem there I guess? 3. How can I determine this for sure? I've configured the es.yml file to be consistent with our other instance on the other box which is working correctly. 4. Ok, I'm confused about this - again, on our other box that's working fine, the etc/elasticsearch.yml is the one being used. Are you sure about this? – Stpete111 Sep 19 '18 at 20:38
  • 5. let me take a look at that one and see if there is an issue with that. – Stpete111 Sep 19 '18 at 20:39