0

I have a query which returns the most frequent values of a column and there frequency. (I've trimmed out the meat of this query which makes the CTE helpful)

Query 1

WITH data AS
(
    SELECT question, COUNT(*) AS frequency
    FROM DB1
    GROUP BY 1
    LIMIT 50
)

SELECT question, frequency
FROM data
ORDER BY frequency DESC

Result

Question Frequency
Hello 132,140
World 120,492

Not included in the query but included in the original table is a unique question id for each question, "Hello" on two different occasions would have two different question ids. I want to write a query which returns the ids of the users who asked the most frequent questions by using a different table of question id's and user id's from that interaction.

DB2

Question id User id
12345 2537133
67890 3149172

Query 2 Attempt

WITH data AS
(
    SELECT question, COUNT(*) AS frequency
    FROM DB1
    GROUP BY 1
    LIMIT 50
)

SELECT question_id
FROM DB2
WHERE question_id IN 
            (SELECT question_id FROM DB1 WHERE question IN 
                    (SELECT question FROM data))

I've tried replacing the logic from the WHERE with a JOIN but in order to do that I have to add question_id to the CTE 'data' and doing so causes the first query to fail as I have to group on question and question_id but question_id is unique per question so it just returns 50 random questions.

Cmerdman
  • 1
  • 1

1 Answers1

0
select user_id from db2 where question_id in (
    select question_id from db1 group by question_id
    having count(*) >= all (select count(*) from db1 group by question_id)
);

Or the more modern way:

with freq as (
    select question_id, dense_rank() over (order by count(*) desc) as dr
    from db1 group by question_id
)
select user_id from db2
where question_id in (select question_id from freq where dr = 1);
shawnt00
  • 16,443
  • 3
  • 17
  • 22