How to get the last record from the duplicate records in SQL?

Question

I want to get the last record from the duplicate records and want the non-duplicate records also.

As depicted in the below image I want to get row number 4, 5, 7 and 9 in my output.

Here, In the below image the ** Main table** was shown. From which I have to concat first two columns and then from that new column I need the last row of duplicate records and the non-duplicate rows also.

I have tried with the given below SQL code.

DECLARE @dense_rank_demo AS TABLE (
    Bid INT,
    cid INT,
    BCode NVARCHAR(10)
);
    
INSERT INTO @dense_rank_demo(Bid,cid,BCode)
VALUES(2393,1,'LAX'),(2394,54,'BRK'),(2395,57,'ONT'),(2393,1,'SAN'),(2393,1,'LAX'),(2393,1,'BRK'),(2394,54,'ONT'),(2395,57,'SAN'),(2394,1,'ONT');

    
SELECT * FROM @dense_rank_demo;

SELECT 
  CONCAT([Bid],'_',[cid]) as [Key],BCode,DENSE_RANK() over( order by CONCAT([Bid],'_',[cid])) 
     
        
    from @dense_rank_demo

is there no date field that you could use to make the logic i.e chose the most recent Bid when there are duplicates ? Otherwise you might not get the row that you want every time — trillion, Nov 11 '22 at 16:11

score 2 · Accepted Answer · edited Nov 17 '22 at 09:45

From the SQL code I found that there is no column on which we can apply order by for getting the expected Result.

So that, I have add one column name Id and done some other changes for getting expected output.

Here I am Sharing the code in which I have done some changes.

DECLARE @dense_rank_demo AS TABLE (
ID INT IDENTITY(1,1),
Bid INT,
cid INT,
BCode NVARCHAR(10));
DECLARE @tableGroupKey TABLE
(
    dr bigint,
    [Key] VARCHAR(50)
)   
INSERT INTO @dense_rank_demo(Bid,cid,BCode)
VALUES(2393,1,'LAX'),
(2394,54,'BRK'),
(2395,57,'ONT'),
(2393,1,'SAN'),
(2393,1,'LAX'),
(2393,1,'BRK'),
(2394,54,'ONT'),
(2395,57,'SAN'),
(2394,1,'ONT');
with [drd] as
(
   select
    concat([Bid],'_',[cid]) as [Key],
    BCode,
    dense_rank() over(partition by concat([Bid],'_',[cid]) order by ID) as 
[dr]   
  from @dense_rank_demo
)
INSERT INTO @tableGroupKey(dr,[Key])
select MAX(dr) dr,[Key]
from [drd]
GROUP BY [Key]
SELECT *,CONCAT(Bid,'_',cid) AS [Key] FROM  @dense_rank_demo [drd]
select Result.* FROM
(
  SELECT *,CONCAT(Bid,'_',cid) AS [Key] ,
  dense_rank() over(partition by concat([Bid],'_',[cid]) order by ID) as 
[dr]  
FROM @dense_rank_demo [drd]
) as [Result]
INNER JOIN @tableGroupKey [gk] ON
[Result].[Key] = [gk].[Key] AND [gk].dr = [Result].dr
ORDER BY [Result].ID

The Expected output is as below:

[Output]

score 0 · Answer 2 · answered Nov 11 '22 at 16:06

The issue here is the ordering of the values within the result set. If you had a specific order to use, this would be fairly straightforward - however, you are relying on dense_rank() to consistently and reliably returning the same values for those in the table. If we could use, for example, the alpha sort on the BCode column then it would be simple to use a CTE and get the last/first one:

with [drd] as
(
    select
        concat([Bid],'_',[cid]) as [Key],
        BCode,
        dense_rank() over(partition by concat([Bid],'_',[cid]) order by Bcode desc) as [dr]   
    from @dense_rank_demo
)
select *
from [drd]
where dr = 1

As the order of dense_rank() is not guaranteed in your code, I'm not sure that this is feasible in a scalable way.

See this for more information about reliably sorted results: how does SELECT TOP works when no order by is specified?

score 0 · Answer 3 · answered Nov 11 '22 at 16:19

you need one row per BID i.e the latest one, But you have not specified the logic of the last row. Usually, last row is the most recent added one and so there is usually a timestamp that can be used to pick the latest row where there are duplicates.
The code below uses the Bcode as a part of the order by calause, that means it will automatically pick the row that has the lowest alphabet order, which not be the row that you expect unless thats how you define the most recent row. You would in general need to play with the order by clause based on your needs but the timestamp makes most sense
row_number() generates the values 1-n based on the partition by, incase there is a tie, and you need both rows, then you need to use the dense_rank instead. Based on your needs you can adjust that

with main as (

select 
concat(Bid, cid) as key,
row_number() over(partition by concat(Bid, cid) order by Bcode) as rank_
from <table_name>
)

select * from main where rank_ = 1

How to get the last record from the duplicate records in SQL?

3 Answers3