SQL: How to select distinct on some columns

Question

I have a table looking something like this:

+---+------------+----------+
|ID | SomeNumber | SomeText |
+---+------------+----------+
|1  | 100        | 'hey'    |
|2  | 100        | 'yo'     |
|3  | 100        | 'yo'     | <- Second occurrence
|4  | 200        | 'ey'     |
|5  | 200        | 'hello'  |
|6  | 200        | 'hello'  | <- Second occurrence
|7  | 300        | 'hey'    | <- Single
+---+------------+----------+

I would like to extract the rows where SomeNumber appears more than ones, and SomeNumbers and SomeText are distinct. That means I would like the following:

+---+------------+----------+
|ID | SomeNumber | SomeText |
+---+------------+----------+
|1  | 100        | 'hey'    |
|2  | 100        | 'yo'     |
|4  | 200        | 'ey'     |
|5  | 200        | 'hello'  |
+---+------------+----------+

I don't know what to do here.

I need something along the lines:

SELECT t.ID, DISTINCT(t.SomeNumber, t.SomeText)  --this is not possible
FROM (
    SELECT mt.ID, mt.SomeNumber, mt.SomeText
    FROM MyTable mt
    GROUP BY mt.SomeNumber, mt.SomeText --can't without mt.ID
    HAVING COUNT(*) > 1
)

Any suggestions?

I removed the incompatible database tags. Please tag with the database you are really using. — Gordon Linoff, Dec 23 '16 at 13:07
Why does `100 'hey'` appear in your result set, it being a "single?" — Tim Biegeleisen, Dec 23 '16 at 13:08
I need all the numbers, unless they only appear ones. If the text appears twice on the same number, only one must remain. — Rasmus Bækgaard, Dec 23 '16 at 13:23

Zohar Peled · Accepted Answer · 2016-12-23T14:35:00.567

4

Using a cte with row number and count rows might get you what you need:

Create and populate sample table (Please save us this step in your future questions):

CREATE TABLE MyTable(id int, somenumber int, sometext varchar(10));
INSERT INTO MyTable VALUES
(1,100,'hey'),
(2,100,'yo'),
(3,100,'yo'),
(4,200,'ey'),
(5,200,'hello'),
(6,200,'hello'),
(7,300,'hey');

The query:

;WITH cte as 
(
    SELECT id, 
           someNumber,
           someText, 
           ROW_NUMBER() OVER (PARTITION BY someNumber, someText ORDER BY ID) rn, 
           COUNT(id) OVER (PARTITION BY someNumber) rc
    FROM MyTable
)
SELECT id, someNumber, someText
FROM cte
WHERE rn = 1
AND rc > 1

Results:

id  someNumber  someText
1   100         hey 
2   100         yo  
4   200         ey  
5   200         hello

edited Dec 23 '16 at 14:35

answered Dec 23 '16 at 14:11

Zohar Peled

79,642
10
69
121

I really need to look into that ``PARTITION`` function. It works - I take it :) Thank you, @zohar-peled – Rasmus Bækgaard Dec 23 '16 at 15:21
Hi Zohar, I had a similar question, and I can't understand the role of `PARTITION BY` and `ROW_NUMBER()` I have 80 columns and I want to gt distinct on the last 72 columns, the first 8 columns doesn't contain PK or ID (it's a view), and I can't decide what columns to choose in the two partition statements, could you help me? – mshwf Oct 09 '17 at 07:25
1

Row_number() simply create a number for each row based on the order by clause (and partition by, if it's specified). If partition by is specified, then each set of distinct value(s) in the partition by clause gets it's own numbers. Take a look at this [simple example.](http://rextester.com/OEHZ99221) – Zohar Peled Oct 09 '17 at 07:52
Thanks, that's what I came up with: http://rextester.com/VXP42578 could you have a look? – mshwf Oct 09 '17 at 09:02
I can't tell if that's ok or not, I have no sample data nor desired results. You should probably post your own question, with sample data and desired results. – Zohar Peled Oct 09 '17 at 09:17

SQL: How to select distinct on some columns

1 Answers1