
We are starting to design a cluster and have come up with the following configuration, which we believe is close to optimal. Please suggest whether there is any scope for improvement, or whether we can save some budget if it is over-provisioned.

  1. 100 fields at roughly 1 KB per field (including the inverted index) gives ~100 KB per document, padded to 125 KB per document to be on the safe side. 4M documents then come to 500 GB in total; at 25 GB per shard that is 20 primary shards, so 40 shards in total with one replica (r = 1). We have read that a shard size of around 25 GB works well in most scenarios. (See the sizing sketch after this list.)

  2. It also seems that a maximum heap of 32 GB per JVM works well (it keeps the JVM on compressed object pointers), which translates to a 64 GB budget per shard once the other 50% is reserved for the filesystem cache. So, on machines with 256 GB RAM, this gives 2 shards per machine (2 × 64 GB = 128 GB, with the rest as headroom): 20 data nodes per cluster (2 shards per data node), 3 master nodes (for HA), and 1 coordinating node. (See the node-count sketch after this list.)
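
To make the arithmetic in point 1 easy to re-check as the estimates change, here is a minimal back-of-the-envelope sketch in Python. The per-field size, safety factor, document count, and target shard size are the question's assumptions, not measured values.

```python
import math

# Assumptions from point 1 (estimates, not measurements): ~1 KB per field
# including its share of the inverted index, padded ~25% to be safe.
FIELDS_PER_DOC = 100
BYTES_PER_FIELD = 1024
SAFETY_FACTOR = 1.25
DOC_COUNT = 4_000_000
TARGET_SHARD_GB = 25      # rule-of-thumb shard size from the question
REPLICAS = 1              # r = 1

doc_bytes = FIELDS_PER_DOC * BYTES_PER_FIELD * SAFETY_FACTOR   # 125 KB/doc
total_gb = doc_bytes * DOC_COUNT / 1024**3                     # ~477 GB
primary_shards = math.ceil(total_gb / TARGET_SHARD_GB)         # 20
total_shards = primary_shards * (1 + REPLICAS)                 # 40

print(f"{doc_bytes / 1024:.0f} KB/doc, {total_gb:.0f} GB total, "
      f"{primary_shards} primaries, {total_shards} shards with r={REPLICAS}")
```

At index-creation time the result maps to `number_of_shards: 20` and `number_of_replicas: 1` in the index settings.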

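And the same kind of sketch for the node math in point 2, mirroring the question's budget of 64 GB per shard (32 GB heap plus an equal slice of filesystem cache). In practice the heap would be pinned just under the limit (e.g. -Xms31g -Xmx31g in jvm.options) so compressed object pointers stay enabled; that detail is left out of the arithmetic.

```python
# Assumptions carried over from point 2: the question budgets 64 GB of RAM
# per shard (32 GB JVM heap + an equal amount left to the filesystem cache).
TOTAL_SHARDS = 40          # 20 primaries + 20 replicas from point 1
HEAP_GB = 32               # ceiling that keeps compressed object pointers
FS_CACHE_GB = 32           # mirror the heap: ~50% of the budget
SHARDS_PER_NODE = 2
MACHINE_RAM_GB = 256

ram_per_shard = HEAP_GB + FS_CACHE_GB            # 64 GB
ram_per_node = ram_per_shard * SHARDS_PER_NODE   # 128 GB per data node
data_nodes = TOTAL_SHARDS // SHARDS_PER_NODE     # 20 data nodes

print(f"{data_nodes} data nodes at {ram_per_node} GB each "
      f"({MACHINE_RAM_GB - ram_per_node} GB headroom on a 256 GB machine), "
      f"plus 3 master nodes (HA) and 1 coordinating node")
```
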
Please add your recommendations.

Nag
  • Can you also provide the functional and non-functional requirements, as explained in the SO answer https://stackoverflow.com/a/60584211/4039431 ? – Amit May 22 '20 at 03:56
  • Please consider them reasonable, and please share some details. – Nag May 22 '20 at 04:02
