I'd like to have a table, which only contains the top n rows per group. Consider this table:
Field | Type
--------------------------------------------
a | VARCHAR(STRING) (primary key)
b | VARCHAR(STRING) (primary key)
c | VARCHAR(STRING) (primary key)
d | INTEGER
For every group (denoted by the primary key), I need e. g. the 10 rows with the highest value in column d. d is an aggregation over {a, b, c}, which sums up column c. This works pretty easy in normal SQL with ROW_NUMBER()
, as described here: How do I use ROW_NUMBER()? , where you simply assign a number to every row, which depicts the row's placement in descending order depending on the value of column d.
Unfortunately, ksql doesn't support subqueries yet, which you need for ROW_NUMBER()
. https://github.com/confluentinc/ksql/issues/745 I'm also not quite sure whether ksql supports ROW_NUMBER()
, I think not, since I haven't found anything in the documentation and didn't manage to run it by myself.
I also found the TOPK
function in ksql, but that doesn't seem to work as expected. https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/aggregate-functions/#topk
There is an issue about it on GitHub https://github.com/confluentinc/ksql/issues/403 . When I run it by myself, I obtain a column with an array with the top n values for {a, b}, which I can't map to the according values of column c. So that doesn't work as well. Example of what TOPK
yields for k = 5:
a | b | d
--------------------------------------------
val1 | val2 | [10, 8, 6, 4, 2]
val1 | val4 | [7, 3, 3, 1, 0]
val1 | val6 | [5, 4, 3, 2, 1]
Here is an example of what I actually need, assuming that {10, 8, 6, 4, 2} are the 5 biggest values of d. For {a, b}, I want the values of c with the 5 biggest values of d.
a | b | c | d
--------------------------------------------
val1 | val2 | val3 | 10
val1 | val2 | val8 | 8
val1 | val2 | val9 | 6
val1 | val2 | val10 | 4
val1 | val2 | val11 | 2
Now, is there any possibility to do a top-n query in ksql? Or is it on the roadmap for future releases? Thanks.