I got table T
with 2 column as follow example:
C1 C2
----------
A x
A x
A y
B x
B x
I want to count number of distinct C1
for each value in C2
. This result should be like:
C1 distinct count
----------------------
A 2 // count distinct x,x,y = 2
B 1 // count distinct x,x = 1
it is easy to come out with a SQL query like this
select C1, count(distinct C2) from T group by C1
however, as discussed in postgresql COUNT(DISTINCT …) very slow, this query yield poor performance. I would like to use the improved query (count (*) (select distinct ...)
) as suggested in that article but I don't know how to form the query with group by.