EDITED: Based on comments from @opster elasticsearch ninja, I edited the original question to keep it focused on the low disk watermark error in ES.
For more general server optimization on a small machine, see: Debugging Elasticsearch and tuning on small server, single node
For the follow-up on the original question and considerations related to debugging ES failures, see also: https://chat.stackoverflow.com/rooms/213776/discussion-between-opster-elasticsearch-ninja-and-user305883
Problem: I noticed that Elasticsearch fails frequently, and I need to restart the server manually.
This question may relate to: High disk watermark exceeded even when there is not much data in my index
I want to better understand what Elasticsearch does when disk space runs low and how to optimise the configuration, and only afterwards, if needed, set up an automatic restart when the system fails.
Could you help me understand how to read the Elasticsearch journal and choose how to fix the problems accordingly, and suggest best practices for tuning server operations on a small machine?
My priority is to avoid system crashes; somewhat lower performance is OK, and I have no budget to increase the server size.
Hardware
I am running Elasticsearch on a single small server (2 GB RAM) with 3 indices (500 MB, 20 MB and 65 MB of store size) and several GB free on a solid-state disk. I would like to allow the use of virtual memory (swap) rather than consuming RAM.
Below is what I did:
What does the journal say?
journalctl | grep elasticsearch
to explore failures related to ES:
May 13 05:44:15 ubuntu systemd[1]: elasticsearch.service: Main process exited, code=killed, status=9/KILL
May 13 05:44:15 ubuntu systemd[1]: elasticsearch.service: Unit entered failed state.
May 13 05:44:15 ubuntu systemd[1]: elasticsearch.service: Failed with result 'signal'.
Here I can see ES was killed.
EDITED: I have found that this was due to an out-of-memory error from Java (see the output of service elasticsearch status below).
Readers may also find it useful to run:
java -XX:+PrintFlagsFinal -version | grep -iE 'HeapSize|PermSize|ThreadStackSize'
to check the current memory settings.
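As a cross-check, status=9/KILL means the process received SIGKILL, which is the signal the kernel OOM killer sends. A minimal sketch for searching the kernel log for such events follows. The helper name `oom_kills` is my own, and the exact wording of kernel messages varies between kernel versions, so the pattern is an assumption:

```shell
# Hypothetical helper: keep only lines on stdin that look like OOM-killer messages.
# The patterns are an assumption; kernel message wording differs across versions.
oom_kills() {
  grep -iE 'out of memory|oom-killer|killed process'
}

# On a live system (may need root to read the kernel journal):
#   journalctl -k | oom_kills
```

If this prints a line naming the java process around the time of the failure, the kill most likely came from the kernel rather than from ES itself.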
What does the ES log say?
check:
/var/log/elasticsearch
[2020-05-09T14:17:48,766][WARN ][o.e.c.r.a.DiskThresholdMonitor] [my_clustername-master] high disk watermark [90%] exceeded on [Ynm6YG-MQyevaDqT2n9OeA][awesome3-master][/var/lib/elasticsearch/nodes/0] free: 1.7gb[7.6%], shards will be relocated away from this node
[2020-05-09T14:17:48,766][INFO ][o.e.c.r.a.DiskThresholdMonitor] [my_clustername-master] rerouting shards: [high disk watermark exceeded on one or more nodes]
What does "shards will be relocated away from this node" mean if I only have one server and one instance running?
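To read the numbers in that warning: as far as I understand, the watermark percentages refer to used disk space, so free: 7.6% means about 92.4% used, which is above the 90% high watermark. A small sketch that derives the used percentage from such a log line is below. The function name `used_pct` is mine, and the regex assumes the `free: <size>[<pct>%]` layout seen in my log:

```shell
# Hypothetical helper: read a DiskThresholdMonitor log line on stdin and
# print the used-disk percentage (100 minus the free percentage).
used_pct() {
  sed -nE 's/.*free: [^[]*\[([0-9.]+)%\].*/\1/p' | awk '{ printf "%.1f\n", 100 - $1 }'
}

line='[2020-05-09T14:17:48,766][WARN ][o.e.c.r.a.DiskThresholdMonitor] [my_clustername-master] high disk watermark [90%] exceeded on [Ynm6YG-MQyevaDqT2n9OeA][awesome3-master][/var/lib/elasticsearch/nodes/0] free: 1.7gb[7.6%], shards will be relocated away from this node'

printf '%s\n' "$line" | used_pct   # prints 92.4
```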
service elasticsearch status
Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2020-05-09 13:47:02 UTC; 32min ago
Docs: http://www.elastic.co
Process: 22691 ExecStartPre=/usr/share/elasticsearch/bin/elasticsearch-systemd-pre-exec (code=exited, status=0/SUCCES
Main PID: 22694 (java)
CGroup: /system.slice/elasticsearch.service
└─22694 /usr/bin/java -Xms512m -Xmx512m -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+U
What does my configuration say?
I am using the default configuration in `/etc/elasticsearch/elasticsearch.yml`
and don't have any watermark options configured, like those in https://stackoverflow.com/a/52006486/305883
Should I include them? What would they do?
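For context, the watermark thresholds I am asking about are the `cluster.routing.allocation.disk.watermark.low` and `cluster.routing.allocation.disk.watermark.high` settings. As far as I understand, they can be set either in elasticsearch.yml or at runtime through the cluster settings API. A minimal sketch of the runtime variant follows, assuming ES listens on http://localhost:9200. The function names and the `ES_URL` variable are mine, and the values in the usage note are purely illustrative, not a recommendation:

```shell
# Hypothetical helper: build the JSON body for the cluster settings API.
# $1 = low watermark, $2 = high watermark. Both must be percentages (used space)
# or both absolute sizes (free space that must remain), not a mix.
watermark_payload() {
  printf '{"persistent":{"cluster.routing.allocation.disk.watermark.low":"%s","cluster.routing.allocation.disk.watermark.high":"%s"}}\n' "$1" "$2"
}

# Hypothetical helper: send the settings to the cluster.
# ES_URL is an assumption; adjust it to the node's actual address.
apply_watermarks() {
  curl -s -X PUT "${ES_URL:-http://localhost:9200}/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d "$(watermark_payload "$1" "$2")"
}
```

For example, `apply_watermarks 1gb 500mb` would ask ES to stop allocating new shards to the node when less than 1gb is free, and to try to move shards away when less than 500mb is free.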
Please note that I have uncommented #bootstrap.memory_lock: true
because I only have 2 GB of RAM.
Even if Elasticsearch performs poorly when memory is swapping, my priority is that it does not fail and that the sites stay up and running.
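To check what this setting actually does on my node: as far as I understand, `bootstrap.memory_lock: true` asks ES to lock its memory in RAM so that it is not swapped out, and the nodes API reports whether the lock succeeded. A minimal sketch, assuming ES listens on http://localhost:9200; the helper names and the `ES_URL` variable are mine:

```shell
# Hypothetical helper: succeed if the JSON on stdin reports "mlockall": true.
mlockall_enabled() {
  grep -Eq '"mlockall"[[:space:]]*:[[:space:]]*true'
}

# Hypothetical helper: ask the node whether memory locking is in effect.
# ES_URL is an assumption; adjust it to the node's actual address.
check_mlockall() {
  if curl -s "${ES_URL:-http://localhost:9200}/_nodes?filter_path=**.mlockall" | mlockall_enabled; then
    echo "memory is locked (ES will not swap)"
  else
    echo "memory is not locked (ES may swap)"
  fi
}
```

Running `check_mlockall` after a restart should show whether the node is allowed to swap or not.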
Running on a single-node machine: how to handle unassigned replicas?
I understand that a replica cannot be assigned to the same node as its primary. As a consequence, does it make sense to have replicas on a single node? If a primary shard fails, will the replicas come to the rescue, or will they be unused anyway?
I wonder whether I should delete them to free up space, or whether it is better not to.
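For anyone who wants to reproduce what I am looking at, here is a minimal sketch for listing unassigned shards and, if that turns out to be the right choice, setting the replica count to 0 on all indices. It assumes ES listens on http://localhost:9200. The function names and the `ES_URL` variable are mine:

```shell
# Hypothetical helper: from "index shard prirep state" rows on stdin,
# keep only the rows whose state column is UNASSIGNED.
unassigned_shards() {
  awk '$4 == "UNASSIGNED"'
}

# Hypothetical helper: list unassigned shards on the cluster.
# ES_URL is an assumption; adjust it to the node's actual address.
list_unassigned() {
  curl -s "${ES_URL:-http://localhost:9200}/_cat/shards?h=index,shard,prirep,state" | unassigned_shards
}

# Hypothetical helper: set number_of_replicas to 0 on every index.
drop_replicas() {
  curl -s -X PUT "${ES_URL:-http://localhost:9200}/_all/_settings" \
    -H 'Content-Type: application/json' \
    -d '{"index":{"number_of_replicas":0}}'
}
```

On my single node I would expect `list_unassigned` to show only rows with `r` (replica) in the third column.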