One column in my data is sample, another is category. Duplicates are allowed. The number of unique categories I currently have is 5. Here is a simplified example:
sample category other_columns
122 a
123 a
124 a
125 a
123 b
124 b
125 b
122 c
123 c
124 c
... ...
I need to select only those samples that exist in all categories (122 is not in 'b' and 125 is not in 'c'), so the expected result is:
sample category
123 a
124 a
123 b
124 b
123 c
124 c
So, if I run
SELECT category, COUNT(DISTINCT sample, category)
FROM my_table
GROUP BY category
all counts should be the same.
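For illustration, one way such a selection might be written is to keep only the samples whose number of distinct categories equals the total number of distinct categories in the table. The sketch below is a minimal, runnable version under some assumptions: it uses Python's built-in sqlite3 module purely as a test harness (the actual database engine is not stated above), it loads only the ten rows shown in the simplified example (so three categories rather than five), and it leaves out other_columns. The inner query itself uses only standard GROUP BY / HAVING syntax.

```python
import sqlite3

# In-memory database used only as a harness; the real engine is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (sample INTEGER, category TEXT)")

# Only the rows visible in the simplified example (the elided rows are not guessed).
rows = [
    (122, "a"), (123, "a"), (124, "a"), (125, "a"),
    (123, "b"), (124, "b"), (125, "b"),
    (122, "c"), (123, "c"), (124, "c"),
]
conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)

# Keep a sample only if it appears in as many distinct categories
# as there are distinct categories in the whole table.
query = """
SELECT sample, category
FROM my_table
WHERE sample IN (
    SELECT sample
    FROM my_table
    GROUP BY sample
    HAVING COUNT(DISTINCT category) =
           (SELECT COUNT(DISTINCT category) FROM my_table)
)
ORDER BY category, sample
"""
result = conn.execute(query).fetchall()

for sample, category in result:
    print(sample, category)
```

COUNT(DISTINCT category) is used in the HAVING clause so that duplicate (sample, category) rows do not inflate the count. Duplicate rows in the source table would still appear in the output; SELECT DISTINCT could be used if that is not wanted.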