DCE Cassandra 3.9 slow secondary index creation during joining existing cluster

Question

We have cassandra cluster with 32 nodes, average node size is about 1TB. Node configuration 1xIntel Xeon E3-1271v3, 32GB ram, 2x3TB HDD. We have one DB with some small tables and one big table, that holds is about 90-95% of total cluster size.

I try to add additional nodes to this cluster, but suddenly find out, that adding one node to existing cluster take is about 13-14 days for joining to cluster. Build secondary indexes take most of this time and all this time i see that all compactor threads take all available CPU.

I have changed cassandra config to extends limits:

concurrent_compactors: 4
compaction_throughput_mb_per_sec: 0

Cassandra full config

Schema

Is about 1 year ago we also add new nodes to this cluster and extend it from 16 nodes to 32 nodes cluster, average node size was 1TB before cluster extends. Cassandra version was 2.1. One node joining time was 1-1.5days.

So the question how can we speed up this process ? Did we miss something ?

Thanks.

Could you think of a better schema without secondary index ? Denormalization might help. — Adrien Piquerez, Apr 14 '17 at 07:14
@DineMartine is basically right just out of curiosity could you add data schema and your access queries to the question? There are more than enough materials and answers on stack overflow to advise not to do it: http://stackoverflow.com/questions/43367076/cassandra-cqlsh-not-working-with-where-clause-on-non-partition-key — Marko Švaljek, Apr 14 '17 at 11:39
@marko-Švaljek we didn't have any issues with queries. Right now we have slow index build during new node bootstrap, for example next 2 index builds runs is about 2 days each, so can we speed up this process ?: 63944d90-196e-11e7-bfc7-f36cff62987e Secondary index build keyspace docs 1348751623 1377995424 bytes 97.88% 8de03eb0-196e-11e7-bfc7-f36cff62987e Secondary index build keyspace docs 1145629997 1236396184 bytes 92.66% — Eugene Zhuzhnev, Apr 14 '17 at 12:41
queries use indexes in the end, using a lot of secondary indexes usually means something's wrong with the data model. Basically I'm just trying to figure out why they are even used in the first place. I understand that this is a lot of work to rewrite etc. But it might make sense to step back and revisit your queries all together because secondary indexes are just not scaleable most of the time. There are a lot of stories on this https://pantheon.io/blog/cassandra-scale-problem-secondary-indexes tl;dr secondary index has to be built for all the data on every node in the cluster — Marko Švaljek, Apr 14 '17 at 12:52
Basically this won't get any better no matter how much you tune it ... — Marko Švaljek, Apr 14 '17 at 12:54

score 0 · Answer 1 · answered Apr 14 '17 at 16:55

This one is a bit longer so I can't put it into comment ... sorry.

I know that this sounds a bit strange, especially for a later stage of your project, but the thing is with the indexes the situation won't get any better over time. I would strongly recommend to start making your own tables instead of just putting index on following stuff. Depending on how often the data is accessed you can use "inverted indexes".

CREATE INDEX links_by_author_url_idx ON keyspace.links_by_author (url);


CREATE INDEX docs_url_idx ON keyspace.docs (url);


CREATE INDEX om_master_object_id_idx ON keyspace.om (master_object_id);


CREATE INDEX actions_pday_idx ON keyspace.actions (pday);


CREATE INDEX authors_yauid_idx ON keyspace.authors (yauid);

CREATE INDEX authors_login_lr_idx ON keyspace.authors (login_lr);

CREATE INDEX authors_login_idx ON keyspace.authors (login);

CREATE INDEX authors_email_idx ON keyspace.authors (email);

CREATE INDEX authors_name_idx ON keyspace.authors (name);

Basically every index that you have here enables you to "search" over base entities to find them by some condition. Most of the conditions are actually pretty narrow which is a good news. But the thing is the indexes will become massive (already did), especially on docs and authors. But I guess doc's is more problematic.

You should consider making separate tables for this. Every index that you create will be there on every node in the cluster and in the end you will hold far more data than you really need too because under the hood data is multiplied per node. When you add replication factor to this system is using a lot of space without you even being aware.

The problem with joining nodes is that when they receive new data all the data in the cluster needs to be rebuilt ... for every single node in the cluster and this is costing you a lot of time. So basically you loose all the benefits of "easy node joining" that cassandra has.

Now you might think that space will become problem when you write the data into your new schema that is denormalized ....

If space is the problem you can use a technique called inverted indexes where you just put the id of the information into the search table and then you make second load in the main table. I used this on some project where space was the issue but since you have all the main stuff indexed space will probably not be a problem because you are already using a lot more than you think. (my bet would be that you will also probably save significantly on space)

Anyway all the indexes should become tables ... if consistency is problem, use batches (don't use materialized views yet because you might loose data).

My honest tip is that you stay away from indexes. I know it's hell to refactor this plus it's hard to get the time to refactor :( But I think it should be manageable.

DCE Cassandra 3.9 slow secondary index creation during joining existing cluster

1 Answers1