0

Here is an odd request. I have a table of around 10 million rows. In those rows are sourcename, sourceState and date. What I am trying to do is create a list of non-duplicate rows that basically has every sourcename, sourcestate and date... but the earliest of each one.

So and example... lets say I have a row that the sourcestate is FL and the source is Broward County and the earliets record it appeared in the database is 19960912 (thats how the date is and its in a varchar column) , but there are 1000;s of other FL and Broward County entries, I need them all thrown out leaving only the unique sourcestate, sourcename and date only... for each unique sourcename that exists in the table.. .which is about 500 or so.

Sal
  • 295
  • 3
  • 5
  • 13
  • Possible duplicate of [SQL: How To Select Earliest Row](http://stackoverflow.com/questions/5736820/sql-how-to-select-earliest-row) – Tab Alleman Mar 08 '16 at 19:40

2 Answers2

1

What comes to mind is row_number():

select t.*
from (select t.*,
             row_number() over (partition by sourcename order by date asc) as seqnum
      from t
     ) t
where seqnum = 1;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • The question is poorly worded in this regard, but I think OP needs results grouped by Name and State, and not just Name. Notice he mentions "unique sourcestate" in the sentence before "unique sourcename". Again, poorly worded, to be sure. – Tab Alleman Mar 08 '16 at 19:43
  • Will this delete all the data other than the earliest sourcename and date? Also I should of mentioned the table name is PEOPLE – Sal Mar 08 '16 at 19:45
  • This will return the earliest occurance of each unique sourcename, and if there are two states with the same sourcename, only _one_ of those states will be returned. – Tab Alleman Mar 08 '16 at 19:46
  • Yes sorry for the poor wording on my part :( But yes, I'd like to only have left in the PEOPLE table a unique sourcename, sourcestate and the earliest date for each sourcename and sourcestate. Again.. poorly worded on my end and I apologize – Sal Mar 08 '16 at 19:47
  • Ok, this answer does not consider sourcestate, so it will not do what you want. The answer in the linked duplicate question will do that. – Tab Alleman Mar 08 '16 at 19:48
  • @sal . . . Put the columns you want to be unique in the `partition by`. Your question is quite clear that you only want `sourcename`: "for each unique sourcename that exists in the table." – Gordon Linoff Mar 09 '16 at 02:11
-1

Try this

    SELECT DISTINCT SOURCENAME, SOURCESTATE, MIN(DATE) FROM TABLE_A 
    GROUP BY SOURCENAME, SOURCSTATE
Shakeer Mirza
  • 5,054
  • 2
  • 18
  • 41