2

I've spent an inordinate amount of time this morning trying to Google what I thought would be a simple thing. I need to set up an SQL query that selects multiple columns, but only returns one instance if one of the columns (let's call it case_number) returns duplicate rows.

select case_number, name, date_entered from ticket order by date_entered

There are rows in the ticket table that have duplicate case_number, so I want to eliminate those duplicate rows from the results and only show one instance of them. If I use "select distinct case_number, name, date_entered" it applies the distinct operator to all three fields, instead of just the case_number field. I need that logic to apply to only the case_number field and not all three. If I use "group by case_number having count (*)>1" then it returns only the duplicates, which I don't want.

Any ideas on what to do here are appreciated, thank you so much!

mo3bius
  • 43
  • 4
  • What you are describing doesn't have any logic. To my understanding, keeping case_number distinct is all you need and you don't care what comes in name and date_entered columns, right? – Cetin Basoz Oct 15 '21 at 16:55
  • I'd like to return all three columns so I can put in a spreadsheet, but eliminate duplicate rows that would be caused by duplicate rows with the case_number in the table. Hope that makes sense. – mo3bius Oct 15 '21 at 17:04
  • You should research windowing functions (SQL Partition by) in order to eliminate the duplicates. It should sort your duplicates by your preference and you can rank the sortig. Then only pick rank = 1. – NonProgrammer Oct 15 '21 at 17:08
  • @mo3bius, you didn't understand what I am saying. When there is a case_number duplicate, from which row the other column's content would come? If they were all same "select distinct ..." would be the answer. IOW, define your logic and add some sample data and expected output. – Cetin Basoz Oct 15 '21 at 17:39
  • Why don't you have a central `ticket` table with `case_number` as the primary key? It seems you are actually dealing with a `ticket_contents` table, hence the multiple entries? Maybe you should either use another already existing table, or should consider a redesign of your schema. – Kaii Oct 15 '21 at 17:57
  • Thanks everyone, sorry this was a business application so put together the best example I could using fake data which was not exactly what I was dealing with. I believe I have the results I need using the row_number function. – mo3bius Oct 15 '21 at 22:48
  • Does this answer your question? [Get top 1 row of each group](https://stackoverflow.com/questions/6841605/get-top-1-row-of-each-group) – Charlieface Oct 16 '21 at 20:15

1 Answers1

3

You can use ROW_NUMBER(). For example

select *
from (
  select *,
    row_number() over(partition by case_number) as rn
) x
where rn = 1

The query above will pseudo-randomly pick one row for each case_number. If you want a better selection criteria you can add ORDER BY or window frames to the OVER clause.

The Impaler
  • 45,731
  • 9
  • 39
  • 76
  • Thanks so much. I'm trying to get this syntax figured out for my specific application, but I think this is going to be it. – mo3bius Oct 15 '21 at 17:24