Currently, I have several Elasticsearch nodes running on several bare-metal machines, holding indices that are terabytes in size. We're in the process of restructuring our infrastructure, and I'm not sure whether bare metal is still the best approach.
I have been looking at Docker, Mesos, and Vagrant as alternatives, but I'm not sure whether they are even viable for this workload. There are four options I think are relevant (along with the issues I ran into with each):
- Mesos-Elasticsearch: This package runs Elasticsearch on Mesos. It looks promising, but it seems to support scaling only data nodes with small disks, and it has no dedicated master or client nodes. The package on GitHub is still at an alpha stage: with their default setup I got a 'No route to Host' error and a MasterNotDiscoveredException. Does anybody have experience with this?
- Docker: I'm not too familiar with containers, but Docker Hub has several images for Elasticsearch, and Mesos can run containers on top of it. I'm concerned about the limited disk space in each container, since my data is on the scale of terabytes and has to persist. Is resizing a container's disk feasible, or is there a different way to set up Docker containers for this?
- Vagrant VMs: I imagine that having one VM per ES node would make it easy to allocate resources. Are there any substantial benefits to this compared to running on bare metal? It also doesn't seem to be compatible with Mesos.
- Bare-metal: This is the current setup.
I would like to know which of the four is your preferred setup for an Elasticsearch cluster at the terabyte scale. What are the pros and cons of each option?