
This is my first post, and I am convinced there is a better solution than mine. My question is really a design question.

I use Spring Boot 2.1.x to store a user entity in a Cassandra database. This works well so far. The entity is stored with:

  • a Java-generated UUID
  • the mail address
  • a salted bcrypt password
  • some user-defined types...

When somebody logs in, I receive the mail address and the password and have to look up the stored credentials.

To retrieve the user object I would expect a WHERE clause such as "select * from user where username = mail".

However, for that query mail must be the partition key in Cassandra. But I want users to be able to change their mail address, and in that case mail must not be part of the primary key of the table, because primary key columns cannot be updated.

My naive idea is to have an inverse table holding the tuple (mail, Java-generated UUID), use it to look up the UUID for a mail address, and then load the user by that UUID.
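The inverse-table idea can be sketched without a database. The following is only an in-memory illustration, not driver code: the two maps stand in for two hypothetical tables, `users` (keyed by id) and `user_by_email` (keyed by mail), and all class and method names are made up for this example. In a real cluster the two writes would presumably go into a logged batch, and the uniqueness check would need something like a lightweight transaction.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

// In-memory stand-in for two Cassandra tables (hypothetical names):
//   users          (id uuid PRIMARY KEY, email text, password_hash text)
//   user_by_email  (email text PRIMARY KEY, user_id uuid)
class UserDirectory {

    static final class User {
        final UUID id;
        String email;
        final String passwordHash;

        User(UUID id, String email, String passwordHash) {
            this.id = id;
            this.email = email;
            this.passwordHash = passwordHash;
        }
    }

    private final Map<UUID, User> users = new HashMap<>();
    private final Map<String, UUID> userByEmail = new HashMap<>();

    // Writes one row to each "table".
    UUID register(String email, String passwordHash) {
        if (userByEmail.containsKey(email)) {
            throw new IllegalStateException("email already registered");
        }
        UUID id = UUID.randomUUID();
        users.put(id, new User(id, email, passwordHash));
        userByEmail.put(email, id);
        return id;
    }

    // Login path: first read resolves mail -> id, second read loads the user.
    Optional<User> findByEmail(String email) {
        UUID id = userByEmail.get(email);
        return id == null ? Optional.empty() : Optional.ofNullable(users.get(id));
    }

    // The user's key (id) never changes; only the lookup row is replaced.
    void changeEmail(UUID id, String newEmail) {
        User user = users.get(id);
        if (user == null) {
            throw new IllegalArgumentException("unknown user");
        }
        if (userByEmail.containsKey(newEmail)) {
            throw new IllegalStateException("email already registered");
        }
        userByEmail.remove(user.email);
        userByEmail.put(newEmail, id);
        user.email = newEmail;
    }
}
```

The cost of this design is two reads per login and two writes per mail change, in exchange for keeping the mail address out of the user table's primary key.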

I am still learning how to handle Cassandra properly, but IMHO my design is crap.

This is what I have in my user bean.

```java
@PrimaryKeyColumn(type = PrimaryKeyType.PARTITIONED, ordinal = 0, name = "id")
@JsonProperty("id")
private String id;

@PrimaryKeyColumn(type = PrimaryKeyType.CLUSTERED, ordinal = 1, name = "email")
@Email(message = "*Please provide a valid email")
@NotEmpty(message = "*Please provide an email")
@JsonProperty("email")
private String email;
```
Vadim Kotov
msek
  • Have you considered making mail a normal column and using a secondary index on it? – Horia Jan 28 '19 at 16:18
  • Dear Horia, many thanks for your feedback. This seems to work. I was not aware of this and now use a SASI index on the email field. Many thanks. – msek Jan 28 '19 at 17:00
  • Secondary indexes are different from SASI. SASI indexes are currently marked as experimental and have some issues, so it is not advisable to use them in production. Secondary indexes have drawbacks too: a query that uses a secondary index is executed across many nodes, since each node has its own index for the data it owns. There are also specific cases where secondary indexes should not be used: columns with high cardinality, columns with very low cardinality, and columns that are frequently updated or deleted. – Horia Jan 28 '19 at 17:30
  • I would read some more on this matter - [Cassandra at Scale: The Problem with Secondary Indexes](https://pantheon.io/blog/cassandra-scale-problem-secondary-indexes), [Cassandra Native Secondary Index Deep Dive](https://www.datastax.com/dev/blog/cassandra-native-secondary-index-deep-dive) – Horia Jan 28 '19 at 17:31
  • And some reading regarding SASI - [Cassandra SASI Index Technical Deep Dive](http://www.doanduyhai.com/blog/?p=2058) – Horia Jan 28 '19 at 17:32
  • Dear Horia, many thanks for your feedback again. Of course the mail column will have a high cardinality, but I did not expect my question to be so extraordinary. I thought the problem was quite simple. – msek Jan 29 '19 at 08:03
  • BTW the article at pantheon.io covers my question as a use case. – msek Jan 29 '19 at 08:12
  • This deals with the same question - https://stackoverflow.com/questions/25124993/how-to-avoid-secondary-indexes-in-cassandra - and concludes with the same proposal I made. :/ – msek Jan 29 '19 at 08:36

1 Answer


I just want to mention that this topic is essentially the use case that justifies materialized views in Cassandra. I previously tried to solve it with a custom aspect driven by annotations, but in the future I will use materialized views.
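As a rough sketch of what that could look like, the CQL below is held in Java constants so it can be passed to whatever session or template is in use. The table name `users`, the view name `users_by_email`, and the column names and types are assumptions for illustration, not taken from the original schema. It is also worth checking the status of materialized views in the Cassandra version in use, since later releases flag them as experimental.

```java
// CQL statements kept as plain strings; all table, view, and column names are hypothetical.
class UserSchema {

    // Base table: the id alone is the primary key, so the mail address can be updated.
    static final String CREATE_USERS =
        "CREATE TABLE IF NOT EXISTS users ("
        + " id uuid PRIMARY KEY,"
        + " email text,"
        + " password_hash text)";

    // View keyed by mail; Cassandra maintains it when the base table changes.
    // Every column of the view's primary key must be restricted with IS NOT NULL.
    static final String CREATE_USERS_BY_EMAIL =
        "CREATE MATERIALIZED VIEW IF NOT EXISTS users_by_email AS"
        + " SELECT * FROM users"
        + " WHERE email IS NOT NULL AND id IS NOT NULL"
        + " PRIMARY KEY (email, id)";

    // Login lookup goes against the view.
    static final String FIND_BY_EMAIL =
        "SELECT * FROM users_by_email WHERE email = ?";

    // A mail change is a single write to the base table.
    static final String CHANGE_EMAIL =
        "UPDATE users SET email = ? WHERE id = ?";
}
```

Compared with a hand-maintained inverse table, the application issues only one write per mail change, and the view update is left to Cassandra.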

msek