1

I has a table with schema:

create table last_message_by_group
(
    date        date,
    created_at  timestamp,
    message     text,
    group_id    bigint,
    primary key (date, created_at, message_id)
) 
  with clustering order by (created_at desc)

and data should be:

|     date    | created_at | message | group_id |
|  2021-05-11 |  7:23:54   | ddd     |    1     | 
|  2021-05-11 |  6:21:43   | ccc     |    1     | 
|  2021-05-11 |  5:35:16   | bbb     |    2     |
|  2021-05-11 |  4:38:23   | aaa     |    2     | 

It will show messages order by created_at desc partition by date.

But the problem is it can not get last message each group likes:

|     date    | created_at | message | group_id |
|  2021-05-11 |  7:23:54   | ddd     |    1     | 
|  2021-05-11 |  5:35:16   | bbb     |    2     |

created_at is cluster key, so it cant be updated, so I delete and insert new row every new message by group_id, this way make low performance

Is there any way to do that?

Aaron
  • 55,518
  • 11
  • 116
  • 132
Jacky Mao
  • 13
  • 3

1 Answers1

0

I was able to get this to work by making one change to your primary key definition. I added group_id as the first clustering key:

PRIMARY KEY (date, group_id, created_at, message_id)

After inserting the same data, this works:

> SELECT date, group_id, max(created_at), message
  FROM last_message_by_group
  WHERE date='2021-05-11'
  GROUP BY date,group_id;

 date       | group_id | system.max(created_at)          | message
------------+----------+---------------------------------+---------
 2021-05-11 |        1 | 2021-05-11 12:23:54.000000+0000 |     ddd
 2021-05-11 |        2 | 2021-05-11 10:35:16.000000+0000 |     bbb

(2 rows)

There's more detail on using CQL's GROUP BY clause in the official docs.

there is one problem, because you changed clustering key, so message will be ordered by group_id first. Any idea for still order by created_at and 1 message each group?

From the document linked above:

the GROUP BY option only accept as arguments primary key column names in the primary key order.

Unfortunately, if we were to adjust the primary key definition to put created_at before group_id, we would also have to group by created_at. That would create a "group" for each unique created_at, which negates the idea behind group_id.

In this case, you may have to decide between having the grouped results in a particular order vs. having them grouped at all. It might also be possible to group the results, but then re-order them appropriately on the application side.

Aaron
  • 55,518
  • 11
  • 116
  • 132
  • Thank you, it seems the answer I'm looking for. I wonder we can make group by in materialize view? – Jacky Mao May 12 '21 at 15:53
  • @JackyMao Just tried it, and "no" it doesn't look like `GROUP BY` works in a materialized view. – Aaron May 12 '21 at 18:30
  • there is one problem, because you changed clustering key, so message will be ordered by group_id first. Any idea for still order by created_at and 1 message each group? – Jacky Mao May 13 '21 at 03:18