12

I have a clickhouse table that has one Array(UInt16) column. I want to be able to filter results from this table to only get rows where the values in the array column are above a threshold value. I've been trying to achieve this using some of the array functions (arrayFilter and arrayExists) but I'm not familiar enough with the SQL/Clickhouse query syntax to get this working.

I've created the table using:

CREATE TABLE IF NOT EXISTS ArrayTest (
    date Date,
    sessionSecond UInt16,
    distance Array(UInt16)
) Engine = MergeTree(date, (date, sessionSecond), 8192);

Where the distance values will be distances from a certain point at a certain amount of seconds (sessionSecond) after the date. I've added some sample values so the table looks like the following:

Table with sample values

Now I want to get all rows which contain distances greater than 7. I found the array operators documentation here and tried the arrayExists function but it's not working how I'd expect. From the documentation, it says that this function "Returns 1 if there is at least one element in 'arr' for which 'func' returns something other than 0. Otherwise, it returns 0". But when I run the query below I get three zeros returned where I should get a 0 and two ones:

SELECT arrayExists(
    val -> val > 7,
    arrayEnumerate(distance))
FROM ArrayTest;

Eventually I want to perform this select and then join it with the table contents to only return rows that have an exists = 1 but I need this first step to work before that. Am I using the arrayExists wrong? What I found more confusing is that when I change the comparison value to 2 I get all 1s back. Can this kind of filtering be achieved using the array functions?

Thanks

egorlitvinenko
  • 2,736
  • 2
  • 16
  • 36
MoshMcCabe
  • 130
  • 1
  • 3
  • 12

2 Answers2

13

You can use arrayExists in the WHERE clause.

SELECT * 
FROM ArrayTest
WHERE arrayExists(x -> x > 7, distance) = 1;

Another way is to use ARRAY JOIN, if you need to know which values is greater than 7:

SELECT d, distance, sessionSecond 
FROM ArrayTest
ARRAY JOIN distance as d
WHERE d > 7
egorlitvinenko
  • 2,736
  • 2
  • 16
  • 36
Mikhail
  • 356
  • 2
  • 6
  • Thanks Mikhail, this works perfectly. I was adding the arrayEnumerate for some reason which I think was the problem. I was also wondering if you can query the indexes in the array too? Like if I wanted to check if distance[1] > 7 or would this be a misuse of the array feature and columns should be used instead? – MoshMcCabe Dec 01 '17 at 17:02
  • 2
    Yes you can check particular element in the array `SELECT * FROM ArrayTest WHERE distance[1] > 5` – Mikhail Dec 01 '17 at 19:25
  • Thanks Mikhail. It's all working as expected for me now. – MoshMcCabe Dec 04 '17 at 10:29
1

I think the reason why you get 3 zeros is that arrayEnumerate enumerates over the array indexes not array values, and since none of your rows have more than 7 elements arrayEnumerates results in 0 for all the rows. To make this work,

SELECT arrayExists(
    val -> distance[val] > 7,
    arrayEnumerate(distance))
FROM ArrayTest;
Narendra Rana
  • 391
  • 3
  • 9