18

I went over the documentation for Clickhouse and I did not see the option to UPDATE nor DELETE. It seems to me its an append only system. Is there a possibility to update existing records or is there some workaround like truncating a partition that has records in it that have changed and then re-insering the entire data for that partition?

Russ Bradberry
  • 10,705
  • 17
  • 69
  • 85
Jonathan
  • 2,183
  • 4
  • 20
  • 25

6 Answers6

22

Through Alter query in clickhouse we can able to delete/update the rows in a table.

For delete: Query should be constructed as

ALTER TABLE testing.Employee DELETE  WHERE  Emp_Name='user4';

For Update: Query should be constructed as

ALTER TABLE testing.employee UPDATE AssignedUser='sunil' where AssignedUser='sunny';
Pavel Smirnov
  • 4,611
  • 3
  • 18
  • 28
Sunil Sunny
  • 531
  • 7
  • 30
19

UPDATE: This answer is no longer true, look at https://stackoverflow.com/a/55298764/3583139

ClickHouse doesn't support real UPDATE/DELETE. But there are few possible workarounds:

  1. Trying to organize data in a way, that is need not to be updated. You could write log of update events to a table, and then calculate reports from that log. So, instead of updating existing records, you append new records to a table.

  2. Using table engine that do data transformation in background during merges. For example, (rather specific) CollapsingMergeTree table engine: https://clickhouse.yandex/reference_en.html#CollapsingMergeTree Also there are ReplacingMergeTree table engine (not documented yet, you could find example in tests: https://github.com/yandex/ClickHouse/blob/master/dbms/tests/queries/0_stateless/00325_replacing_merge_tree.sql) Drawback is that you don't know, when background merge will be done, and will it ever be done.

Also look at samdoj's answer.

uYSIZfoz
  • 911
  • 6
  • 7
  • 2
    from `1.1.54388` release 2018-06-28 you can mutate data. https://clickhouse.yandex/docs/en/changelog/#clickhouse-release-1-1-54388-2018-06-28 – Anton Manevskiy Apr 04 '19 at 10:50
5

You can drop and create new tables, but depending on their size this might be very time consuming. You could do something like this:

For deletion, something like this could work.

  INSERT INTO tableTemp SELECT * from table1 WHERE  rowID != @targetRowID;
    DROP table1;
    INSERT INTO table1 SELECT * from tableTemp;

Similarly, to update a row, you could first delete it in this manner, and then add it.

samdoj
  • 144
  • 8
  • But how would I update a very large table. For example, what if my process was to gather a lot of records, then "correct" 15% of them? – Jonathan Jun 19 '16 at 05:03
  • 1
    Basically just update and delete multiple rows? You could first select those rows into a temp table and just change the where clause to be `WHERE rowID is not in (SELECT rowID FROM targetTable`) You are only limited here by what you can select. – samdoj Jun 19 '16 at 05:18
  • 1
    It would be much more efficient to rename table instead of doing `INSERT INTO table1 SELECT * from tableTemp;` – Darth Kotik Jun 15 '17 at 13:24
5

Functionality to UPDATE or DELETE data has been added in recent ClickHouse releases, but its expensive batch operation which can't be performed too frequently.

See https://clickhouse.yandex/docs/en/query_language/alter/#mutations for more details.

Ivan Blinkov
  • 2,466
  • 15
  • 17
3

It's an old question, but updates are now supported in Clickhouse. Note it's not recommended to do many small changes for performance reasons. But it is possible.

Syntax:

ALTER TABLE [db.]table UPDATE column1 = expr1 [, ...] WHERE filter_expr

Clickhouse UPDATE documentation

supernova
  • 1,762
  • 1
  • 14
  • 31
0

To update values you can use the ReplacingMergeTree engine.

Having this engine on a table, if you try to append a row with the sort keys that already exists in the table, it will replace previous record that have the same sort keys (like upsert).

Amir nazary
  • 384
  • 1
  • 7