I need details on both the performance and the query aspects. I learned from a site that only a key can be given when using a column family. If so, what would you suggest for my keyspace? I need to use group by, order by, count, sum, ifnull, concat, joins, and sometimes nested queries.
6 Answers
To answer the original question you posed: a column family and a table are the same thing.
- The name "column family" was used in the older Thrift API.
- The name "table" is used in the newer CQL API.
More info on the APIs can be found here: http://wiki.apache.org/cassandra/API
If you need to use "group by, order by, count, sum, ifnull, concat, joins, and sometimes nested queries" as you state, then you probably don't want to use Cassandra, since it doesn't support most of those.
CQL supports `COUNT`, but only up to 10000. It supports `ORDER BY`, but only on clustering keys. The other things you mention are not supported at all.
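As a rough illustration of the `ORDER BY` and `COUNT` restrictions, here is a minimal CQL sketch. The table layout is an assumption made for this example: `event_log` is borrowed from the comments below, while the columns `day`, `event_time`, and `message` are hypothetical.

```cql
-- Hypothetical table: one partition per day, rows clustered by event time.
CREATE TABLE event_log (
    day text,
    event_time timestamp,
    message text,
    PRIMARY KEY (day, event_time)
);

-- ORDER BY is accepted here because event_time is a clustering column
-- and the partition key (day) is restricted with an equality.
SELECT * FROM event_log WHERE day = '2013-09-16' ORDER BY event_time DESC;

-- COUNT restricted to a single partition.
SELECT COUNT(*) FROM event_log WHERE day = '2013-09-16';
```

Ordering by a non-clustering column such as `message` would be rejected, which is why the sort order has to be designed into the primary key up front.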

- It's not strictly true that count is supported only up to 10,000. It works up to the query limit (which is 10,000 by default, but can be explicitly defined). That being said, you probably shouldn't use it for performance reasons. – Aurand Sep 16 '13 at 22:30
- Hi, I referred to this link: http://maxgrinev.com/2010/07/12/do-you-really-need-sql-to-do-it-all-in-cassandra/ , but group by gives me an error in cqlsh: `select count(*) from event_log group by date;`. I have also learned that inserting data in Cassandra is much faster than in MySQL. Is that so? – kumar Sep 17 '13 at 05:11
- That is because `group by` is not valid CQL. You cannot just run random SQL statements and expect them to work. – Aurand Sep 17 '13 at 14:42
- @Aurand After a long time spent understanding the Cassandra model, we have decided to use Elasticsearch (Lucene) as a secondary storage layer for all our aggregate functions, group by, and order by. Nested queries are still not well supported in ES, but that is acceptable for our production use. – kumar Aug 12 '14 at 07:26
- Broken links: Thrift API, CQL API, http://wiki.apache.org/cassandra/API . Some possible new ones: [CQL Syntax](https://cassandra.apache.org/doc/latest/cql/index.html), [Drivers API](https://docs.datastax.com/en/landing_page/doc/landing_page/apiDocs.html#DSEDriversAPI) – Curtis Yallop May 20 '20 at 19:40
Refer to the document: https://cassandra.apache.org/doc/old/CQL-3.0.html
It specifies that the CQL language reference (LRM) accepts the TABLE keyword wherever COLUMNFAMILY is accepted.
This shows that TABLE and COLUMNFAMILY are synonyms.

In Cassandra there is no difference between a table and a column family. They are one concept.

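For Cassandra 3+ and cqlsh 5.0.1
To verify, enter the following at a cqlsh prompt within a keyspace (ksp):
CREATE COLUMNFAMILY myTable (
    id text PRIMARY KEY,
    name int
);
Then type 'desc myTable'.
You'll see output that begins with (table options omitted):
CREATE TABLE ksp.mytable (
    id text PRIMARY KEY,
    name int
)
They are synonyms, and Cassandra uses TABLE by default. Note that a primary key is required, and that unquoted identifiers such as myTable are stored in lowercase.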

Here is a small example to illustrate the concept. A keyspace is an object that holds column families and user-defined types.
CREATE KEYSPACE University WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
CREATE TABLE University.student (roll int PRIMARY KEY, dept text, name text, semester int);
After 'CREATE TABLE' runs, the table 'student' is created in the keyspace 'University' with the columns roll, dept, name, and semester. roll is the primary key, and therefore also the partition key, so each row is stored in its own partition.
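To make the role of the partition key concrete, here is a small sketch against the student table from this answer. The inserted values are made up for illustration.

```cql
-- Sample row; the values are hypothetical.
INSERT INTO University.student (roll, dept, name, semester)
VALUES (1, 'CS', 'Alice', 3);

-- Lookup by the partition key: the efficient access path.
SELECT * FROM University.student WHERE roll = 1;

-- Filtering on a non-key column is rejected unless the column is indexed
-- or ALLOW FILTERING is added, which forces a scan and is usually
-- a poor fit for large tables.
SELECT * FROM University.student WHERE dept = 'CS' ALLOW FILTERING;
```

This is the practical meaning of "only a key can be given" in the question: queries are expected to go through the primary key, so tables are usually designed around the queries they need to serve.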
Key aspects of altering a keyspace in Cassandra:
- Keyspace Name: The keyspace name cannot be altered in Cassandra.
- Strategy Name: The strategy name can be altered by specifying a new strategy name.
- Replication Factor: The replication factor can be altered by specifying a new replication factor.
- DURABLE_WRITES: The DURABLE_WRITES value can be altered by setting it to true or false. By default, it is true. If set to false, no updates are written to the commit log; if set to true, they are.
Execution: the example "Alter Keyspace" command changes the keyspace strategy from 'SimpleStrategy' to 'NetworkTopologyStrategy' and the replication factor from 3 to 1 for DataCenter1.
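The original screenshot is not available here, so the following is a reconstruction from the description, not the author's exact command. It assumes the keyspace is the University keyspace created earlier and that the cluster actually has a datacenter named 'DataCenter1'.

```cql
-- Switch University to NetworkTopologyStrategy with one replica in DataCenter1.
-- The datacenter name must match the name the cluster itself reports.
ALTER KEYSPACE University
WITH replication = {'class': 'NetworkTopologyStrategy', 'DataCenter1': 1}
AND durable_writes = true;
```

After changing replication settings, a repair is generally needed so that existing data is placed according to the new settings.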

A column family is somewhat like a relational database table, with differences in how data is distributed and in the design philosophy behind it.
Imagine you have a user entity that might contain 15 columns. In a relational database you might split those columns into small groups of related columns, each of which we know as a table. In a distributed database such as Cassandra you can combine the entries of all those tables into a single long row, so if you use a profiler or database manager you'll see a single table with 15 columns instead of two or three tables. Another interesting point is that the rows of a column family are distributed across different nodes and are located by their row key. This means you have a single key for the whole row, and you don't need to maintain a PK or FK for every table or manage 1-1, 1-n, and n-n relationships between them. Easy!
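A minimal sketch of that idea, assuming a hypothetical `users` table: the column names are invented for illustration, and only a few of the 15 columns are shown.

```cql
-- One wide, denormalized row per user instead of several joined tables.
CREATE TABLE users (
    user_id uuid PRIMARY KEY,
    first_name text,
    last_name text,
    email text,
    street text,
    city text,
    country text
);

-- Everything about a user is fetched with a single key lookup, no joins.
SELECT * FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
```

The profile and address columns, which a relational schema might split into separate tables, live in the same row and are located by the single row key.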
