Lists in NoSQL/BigTable Data Modeling & Super Columns (with Cassandra)

Question

I'm new to NoSQL and BigTable, and I'm trying to learn how I can (and whether I should) use super columns to create a BigTable-friendly schema.

Based on this article about NoSQL data modeling, it sounds like instead of using JOIN-centric RDBMS schemas, I should aggregate my data into larger tables to de-normalize where possible. Based on that, here's a simple schema I envisioned for a 'User', which I'm trying to create for Cassandra:

User: {
    KEY: UserId {
        name: {
           first, 
           last
        },
        age,
        gender
    }
};

The above column family (User), whose key is a 'UserId', is composed of 3 columns (name, age, gender). Its 'name' column would be a super column composed of 'first' and 'last' columns.

So what I'm asking is:

~~What does the CQL 3.0 look like to create this column family 'User' with the 'name' super column within it?~~ (Update: This doesn't appear possible.)
Should I be using super columns (like this)? Should I be using something else?
What's an alternative way of representing this schema?
How do I represent a list of values in a table/column family?

Here are some useful links about this that I found, but that I don't quite understand clearly enough to answer my question:

Thanks!

Update:

After a lot of research, I've learned a few things:

You cannot create super columns using CQL; there might be other mechanisms to do so, but CQL does not appear to be one of them.
The syntax for CQL 3.0 seems to be drifting from a 'COLUMN FAMILY'-centric approach towards an SQL-like 'TABLE'-based syntax.

Changed my questions accordingly.

score 0 · Answer 1 · edited May 23 '17 at 11:56

Should I be using super columns (like this)? Should I be using something else?

You can use the data model you suggested, but it is generally not recommended, for the reasons mentioned in the link:

I'll also note that use of super columns is generally discouraged as they have several disadvantages. All subcolumns in a super column need to be deserialized when reading one sub column and you can not set secondary indexes on super columns. They also only support one level of nesting.

Weigh these drawbacks against your situation before committing to super columns.

What's an alternative way of representing this schema?

You can try using composite columns. Read here for more information. Alternatively, a standard column family is probably enough for your situation. For example:

User: {
  key: userId {
    columnName: firstname
    columnName: lastname
    columnName: age
    columnName: gender
    columnName: zip
    columnName: street
  }
  ..
};
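
To make the flattening concrete, here is a minimal sketch in plain Python. It only models the row as a dictionary and does not use any Cassandra client API; the function name flatten_user and the sample values are hypothetical, chosen for illustration.

```python
def flatten_user(user):
    """Flatten one level of nesting into plain column-name -> value pairs."""
    row = {}
    for column, value in user.items():
        if isinstance(value, dict):
            # A super column: hoist each subcolumn to a top-level column,
            # e.g. "first" under "name" becomes "firstname".
            for sub, sub_value in value.items():
                row[sub + column] = sub_value
        else:
            # An ordinary column is copied unchanged.
            row[column] = value
    return row


# The nested layout from the question (sample values are made up).
nested = {"name": {"first": "Ada", "last": "Lovelace"}, "age": 36, "gender": "F"}
flat = flatten_user(nested)
```

After flattening, every value sits in its own top-level column, so a single value can be read without touching its siblings.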

How do I represent a list of values in a table/column family?

You can serialize the whole list and store it in a single BytesType column in the column family. Alternatively, you can break the list into individual elements and store them using CompositeType.
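
The two options can be sketched in plain Python as follows. This is only an in-memory model under stated assumptions, not Cassandra client code: the row is a dictionary, a composite column name is modeled as a (list_name, index) tuple, JSON is an arbitrary choice of serialization for the blob, and all function names are hypothetical.

```python
import json


def list_to_composite_columns(list_name, values):
    """One column per element; the column name is the pair (list_name, index)."""
    return {(list_name, i): v for i, v in enumerate(values)}


def composite_columns_to_list(row, list_name):
    """Rebuild the list by reading the matching columns in column-name order."""
    keys = sorted(k for k in row if isinstance(k, tuple) and k[0] == list_name)
    return [row[k] for k in keys]


def list_to_blob(values):
    """Serialize the whole list into one opaque byte string."""
    return json.dumps(values).encode("utf-8")


def blob_to_list(blob):
    """Inverse of list_to_blob."""
    return json.loads(blob.decode("utf-8"))


emails = ["a@example.com", "b@example.com"]

# Option 1: one column per element, alongside the ordinary columns.
row = {"firstname": "Ada"}
row.update(list_to_composite_columns("emails", emails))
restored = composite_columns_to_list(row, "emails")

# Option 2: the whole list in a single bytes column.
blob = list_to_blob(emails)
```

The trade-off in this model: with one column per element, a single element can be changed on its own, while the blob has to be read, modified, and rewritten as a whole.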
