I'm trying to use Elassandra as a standalone instance locally. Using bin/cqlsh, I've created a keyspace and added a test table to it. I want to create an index on this table so I can run Elasticsearch queries, but I'm not sure how to go about it. I found this information, but it's just one example that doesn't really go through the options or what they mean. Can anyone point me in the right direction for indexing my table? I've tried going through the Elasticsearch documentation as well, with no luck. Thanks in advance.

Snowy Coder Girl

1 Answer

Yes, I admit the Elassandra documentation is far from perfect and is hard for newcomers.

Let's create a keyspace and a table, and insert some rows:

CREATE KEYSPACE ks WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 1};
CREATE TABLE ks.t (id int PRIMARY KEY, name text);
INSERT INTO ks.t (id, name) VALUES (1, 'foo');
INSERT INTO ks.t (id, name) VALUES (2, 'bar');

NetworkTopologyStrategy is mandatory; SimpleStrategy is not supported.

Mapping every CQL type to an Elasticsearch type by hand is tedious, so there is a discover option that generates the mapping:

curl -XPUT -H 'Content-Type: application/json' 'http://localhost:9200/myindex' -d '{
    "settings": { "keyspace":"ks" },
    "mappings": {
        "t" : {
            "discover":".*"
        }
    }
}'

This creates an index named myindex, with a type named t (the Cassandra table).

The keyspace name must be specified in settings.keyspace because the index name and the keyspace name are different.

The discover field contains a regex. Each Cassandra column whose name matches this regex is indexed automatically, with type inference.

Let's look at the generated mapping:
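To illustrate how the regex narrows what gets indexed, here is a sketch that would discover every column of ks.t except those whose name starts with "name", using a negative lookahead. The index name myindex_no_name and the function name are hypothetical, and the pattern assumes Elassandra evaluates discover as a Java-style regular expression, which is worth checking against your version:

```shell
# Assumption: Elassandra listens on localhost:9200 unless ES_URL is set.
ES_URL="${ES_URL:-http://localhost:9200}"
# Hypothetical index name, chosen only for this illustration.
INDEX="myindex_no_name"

# Mapping body: discover every column of ks.t whose name
# does not start with "name" (negative lookahead).
BODY='{
    "settings": { "keyspace": "ks" },
    "mappings": {
        "t": {
            "discover": "^((?!name).*)"
        }
    }
}'

# Nothing is sent until this function is called: create_index
create_index() {
    curl -XPUT -H 'Content-Type: application/json' "$ES_URL/$INDEX" -d "$BODY"
}
```

After calling create_index, the generated mapping for this index should contain id but not name.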

{
  "myindex": {
    ...
    "mappings": {
      "t": {
        "properties": {
          "id": {
            "type": "integer",
            "cql_collection": "singleton",
            "cql_partition_key": true,
            "cql_primary_key_order": 0
          },
          "name": {
            "type": "keyword",
            "cql_collection": "singleton"
          }
        }
      }
    },
    ...
  }
}

There are several special cql_* options here.

For cql_collection, singleton means that the index field is backed by a Cassandra scalar column, neither a list nor a set. This is mandatory because Elasticsearch fields are multi-valued.

cql_partition_key and cql_primary_key_order tell the index which columns to use to build the _id field.
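If you declare a field yourself instead of relying on discovery, you would set cql_collection explicitly. The sketch below maps name as an analyzed text field (discovery produced keyword, which only matches exact values) and lets discover handle the remaining columns. The index name myindex_text and the function name are hypothetical, and combining properties with discover in one request is an assumption to verify against your Elassandra version:

```shell
# Assumption: Elassandra listens on localhost:9200 unless ES_URL is set.
ES_URL="${ES_URL:-http://localhost:9200}"
# Hypothetical index name, chosen only for this illustration.
INDEX="myindex_text"

# name is mapped by hand as full-text, backed by a scalar column;
# every column not starting with "name" is left to discovery.
BODY='{
    "settings": { "keyspace": "ks" },
    "mappings": {
        "t": {
            "discover": "^((?!name).*)",
            "properties": {
                "name": { "type": "text", "cql_collection": "singleton" }
            }
        }
    }
}'

# Nothing is sent until this function is called: create_text_index
create_text_index() {
    curl -XPUT -H 'Content-Type: application/json' "$ES_URL/$INDEX" -d "$BODY"
}
```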
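With a single-column primary key such as id, the _id should simply be the key value as a string, so the row inserted with id = 1 can be fetched directly and the name field can be searched. A minimal sketch, assuming the myindex index above exists and Elassandra listens on localhost:9200; the function names are hypothetical:

```shell
# Assumption: Elassandra listens on localhost:9200 unless ES_URL is set.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the URL of one document, addressed by its _id.
row_url() {
    printf '%s/myindex/t/%s?pretty' "$ES_URL" "$1"
}

# Build a URI-search URL that queries the name field.
search_url() {
    printf '%s/myindex/_search?q=name:%s&pretty' "$ES_URL" "$1"
}

# Nothing is sent until these are called, e.g.: get_row 1
get_row()     { curl -s "$(row_url "$1")"; }
# e.g.: search_name foo
search_name() { curl -s "$(search_url "$1")"; }
```

Calling get_row 1 should return the document built from the first inserted row, and search_name foo should list it as a hit.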

barth
  • Thanks so much for the help. +1 One final question - where did Elassandra put the indexing table? From my previous playing around, I got the indexing table in the same keyspace as the basic table, but I can't find it in any of the keyspaces now. I found metadata in `system_schema.indexes` and `system."IndexInfo"`, but haven't found the actual index table. Thanks. – Snowy Coder Girl Feb 11 '19 at 19:03
  • For future people seeing this: The index mapping shown in the answer can be gotten via `curl -X GET localhost:9200/myindex/_mapping?pretty` – Snowy Coder Girl Feb 11 '19 at 19:04
  • I'm not sure what you mean by "index table". The index is queried through the ES REST API. Internally it is stored in Lucene files, not in a Cassandra table. You can use the system_schema and elastic_admin keyspaces to collect metadata about indexes. If you want to list indexes, just run `curl localhost:9200/_cat/indices` – barth Feb 13 '19 at 07:07