I'm new to ElasticSearch and am trying to figure out the best way to index 1 terabyte of data stored in Cassandra.

The two options I'm aware of so far are:

  1. Periodically copy the data to ElasticSearch using the Cassandra-River plugin and index it there.
    Advantage: Search queries put no load on Cassandra.
    Disadvantage: The data has to be re-synced periodically.

  2. Run ElasticSearch directly on top of Cassandra and index the data in place, without moving it (I'm not sure how this would be done).
    Advantage: The data is always in sync.
    Disadvantage: Might this hurt Cassandra performance?
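
To make option 1 concrete, here is a minimal sketch of what one periodic sync pass might look like, independent of any plugin. It assumes each Cassandra row carries an `id` and a numeric `updated_at` column (both hypothetical names) and that rows have already been read into Python dicts. It only builds the newline-delimited JSON body that ElasticSearch's `_bulk` endpoint accepts and does not send it. Depending on the ElasticSearch version, a document type may also be required in the URL or in the action line.

```python
import json


def rows_changed_since(rows, watermark):
    """Keep only rows modified after the previous sync pass.

    `updated_at` is an assumed column; any monotonically increasing
    change marker would work the same way.
    """
    return [row for row in rows if row["updated_at"] > watermark]


def batches(items, size):
    """Split a list into chunks so each bulk request stays small."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def to_bulk_body(rows, index_name):
    """Build an NDJSON body for the _bulk endpoint.

    Each document takes two lines: an action line naming the target
    index and id, then the document source. The body must end with
    a newline.
    """
    lines = []
    for row in rows:
        action = {"index": {"_index": index_name, "_id": row["id"]}}
        source = {key: value for key, value in row.items() if key != "id"}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"
```

Each sync pass would call `rows_changed_since` with the highest `updated_at` seen in the previous pass, split the result with `batches`, and POST each `to_bulk_body` result to the cluster. Reusing the Cassandra key as `_id` means a re-sent row overwrites its earlier copy instead of creating a duplicate.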

Any thoughts would be appreciated.

Chins
  • Similar discussion: http://stackoverflow.com/questions/27054954/elasticsearch-vs-cassandra-vs-elasticsearch-with-cassandra/27072018#27072018 – Aaron May 09 '15 at 12:48

1 Answer

Perhaps, in the context of ElasticSearch 1.4 and above, just using ElasticSearch as both the datastore and the search engine might be a simpler and more elegant option. Add more nodes to scale.

Samant