How to ignore duplicate records in sybase?

Question

I'm trying to retrieve only unique records from a table, but I guess something is wrong with my query.

select distinct RIID, duplicateInfo from duplicateRecords where RIID > 3920011

When I execute the above query, I get this result:

RIID   |   duplicateInfo 
___________________________________
3920011    Repeated:12009:CLEAR
3920011    Repeated:12012:CLEAR
4233901    Repeated:18129:HIT
4820129    Repeated:22901:PENDING
4820129    Repeated:22983:PENDING

And I want the result below:

RIID   |   duplicateInfo 
___________________________________    
3920011    Repeated:12012:CLEAR
4233901    Repeated:18129:HIT
4820129    Repeated:22983:PENDING

Any help would be highly appreciated.

Thanks

because duplicateInfo is not distinct – McNets Nov 22 '16 at 22:25
Any suggestion, how would I achieve my task? – Fazil Mir Nov 22 '16 at 22:27

McNets · Answer 1 · 2016-11-23T00:14:59.017

1

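-- duplicateInfoNumber is not a column of the table in the question;
-- it has to be exposed separately (for example through a view).
select distinct RIID,
    (select duplicateInfo 
    from duplicateRecords m 
    where m.RIID = duplicateRecords.RIID 
    having cast(substring(duplicateInfoNumber,10,6) as int) = max(cast(substring(duplicateInfoNumber,10,6) as int)))
from duplicateRecords
where RIID > 3920011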

edited Nov 23 '16 at 00:14

answered Nov 22 '16 at 22:27

Yes, let me modify, you want descending ordered by duplicateInfo. – McNets Nov 22 '16 at 22:30
I tried desc order as well. But it seems to be no luck. – Fazil Mir Nov 22 '16 at 22:32
for the same result you should use where `RRID >= 3920011` – McNets Nov 22 '16 at 22:45
it should. http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.help.doc.ase_docs_12.5.3.newfeatures1253_rev/html/newfeatures1253_rev/newfeatures1253_rev11.htm – McNets Nov 22 '16 at 22:52
Actually it's a different version of Sybase; I cannot even use an order by clause in a subquery. – Fazil Mir Nov 22 '16 at 22:55
it is complicated without any other ID: http://stackoverflow.com/a/15450023/3270427 – McNets Nov 22 '16 at 23:00
I tried it before but it doesn't work as well. Anyways thanks for your time. – Fazil Mir Nov 22 '16 at 23:02
are you able to extract the number after `Result`? – McNets Nov 22 '16 at 23:04
Sorry, I didn't quite catch it, can you elaborate it a bit? What number you're talking about? – Fazil Mir Nov 22 '16 at 23:08
if you can get `12012` from `Repeated:12012:CLEAR` then it's possible to simulate top 1 – McNets Nov 22 '16 at 23:09
I used the str_replace function, but then the situation got worse, because the data keeps changing. It's not static. :( Sometimes the data is like `Duplicate:12122:ETA etc etc.` – Fazil Mir Nov 22 '16 at 23:11
ok, that's all I can do now, I've edited the answer, if you can use a view or get duplicateInfoNumber in any way, I think it should work – McNets Nov 22 '16 at 23:14
I will try it, anyways thanks for your time and concern. – Fazil Mir Nov 22 '16 at 23:15
Does '22983' in 'Repeated:22983:PENDING' relate to a column? – Keith John Hutchison Nov 22 '16 at 23:47
@KeithJohnHutchison yes. If somehow I can take it out from the string then I can achieve what I want. – Fazil Mir Nov 23 '16 at 00:00
Do you have string position, substring and cast / convert functions available to use within your version of sybase? – Keith John Hutchison Nov 23 '16 at 00:24
The key is to split the string on ':', get the second element, cast that to an integer, and then get the max of that grouped by RRID – Keith John Hutchison Nov 23 '16 at 00:26

Keith John Hutchison · Answer 2 · 2016-11-23T01:18:20.457

I don't have Sybase to test this with. Here is an example from MySQL to give you some pointers.

DROP TABLE IF EXISTS `duplicaterecords`;

CREATE TABLE `duplicaterecords` (
  `RRID` int(11) DEFAULT NULL,
  `duplicateInfo` varchar(200) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `duplicaterecords` (`RRID`, `duplicateInfo`)
VALUES
    (3920011,'Repeated:12009:CLEAR'),
    (3920011,'Repeated:12012:CLEAR'),
    (4233901,'Repeated:18129:HIT'),
    (4820129,'Repeated:22901:PENDING'),
    (4820129,'Repeated:22983:PENDING'),
    (4233901,'Duplicate:5555555:CLEAR');

-- Parse the number between the first and second ':' and keep,
-- for each RRID, the row that carries the highest number.
select grouped.*
, base.duplicateInfo 
from 
( 
    select parsed.RRID, max(parsed.duplicateInfoId) duplicateInfoId
    from (
        select RRID
        , cast(substring_index(substring_index(duplicateInfo,':',2 ),':',-1) as unsigned) duplicateInfoId
        , duplicateInfo 
        from duplicaterecords  
    ) parsed
    group by parsed.RRID
) grouped
inner join (
    select RRID
    , cast(substring_index(substring_index(duplicateInfo,':',2 ),':',-1) as unsigned) duplicateInfoId
    , duplicateInfo 
    from duplicaterecords  
) base
-- join on RRID as well, so the same number under another RRID cannot match
on grouped.RRID = base.RRID
and grouped.duplicateInfoId = base.duplicateInfoId ;

-- example results

RRID     duplicateInfoId  duplicateInfo
3920011  12012            Repeated:12012:CLEAR
4233901  5555555          Duplicate:5555555:CLEAR
4820129  22983            Repeated:22983:PENDING
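
The same idea (take the second ':'-separated field, cast it to an integer, keep the row with the highest value per RRID) can be sketched without `substring_index`. The following is only an illustration, run against SQLite from Python because no Sybase instance is assumed to be available. SQLite's `instr()` and `substr()` stand in for position and substring functions such as `charindex()` and `substring()`; the view name `parsed`, the column `infoId` and the helper `latest_per_rrid` are hypothetical names. It assumes every value has the form `prefix:number:rest`.

```python
import sqlite3

# Sample rows taken from the question.
SAMPLE_ROWS = [
    (3920011, "Repeated:12009:CLEAR"),
    (3920011, "Repeated:12012:CLEAR"),
    (4233901, "Repeated:18129:HIT"),
    (4820129, "Repeated:22901:PENDING"),
    (4820129, "Repeated:22983:PENDING"),
]

# Hypothetical view: exposes the number between the first and second ':'.
# 'rest' is everything after the first ':'; infoId is what precedes the next ':'.
PARSED_VIEW = """
CREATE VIEW parsed AS
SELECT RRID,
       duplicateInfo,
       CAST(substr(rest, 1, instr(rest, ':') - 1) AS INTEGER) AS infoId
FROM (
    SELECT RRID,
           duplicateInfo,
           substr(duplicateInfo, instr(duplicateInfo, ':') + 1) AS rest
    FROM duplicateRecords
)
"""

# Correlated subquery: keep rows whose infoId is the maximum for their RRID.
LATEST_PER_RRID = """
SELECT p.RRID, p.duplicateInfo
FROM parsed p
WHERE p.infoId = (SELECT MAX(m.infoId) FROM parsed m WHERE m.RRID = p.RRID)
ORDER BY p.RRID
"""


def latest_per_rrid(rows):
    """Return one (RRID, duplicateInfo) row per RRID: the one with the highest number."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE duplicateRecords (RRID INTEGER, duplicateInfo TEXT)")
    conn.executemany("INSERT INTO duplicateRecords VALUES (?, ?)", rows)
    conn.execute(PARSED_VIEW)
    result = conn.execute(LATEST_PER_RRID).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for rrid, info in latest_per_rrid(SAMPLE_ROWS):
        print(rrid, info)
```

Because the number is located by searching for ':' rather than by a fixed position, a changed prefix such as `Duplicate:` does not break the parsing. If two rows share the same highest number for one RRID, both are returned.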

score 0 · Answer 3 · answered Nov 24 '16 at 13:59

There is a simpler and more efficient way, but only if you're running Sybase ASE (it doesn't work for Sybase IQ or Sybase SQL Anywhere). First, this is a 'duplicate key' problem, not a 'duplicate row' problem. The trick below keeps one row per key and discards the other rows with the same key. Note that it is not defined which row is chosen when keys are duplicated: the first one inserted is kept and the rest are discarded. So you should apply some ordering in the SELECT query to implement a different selection criterion.

CREATE TABLE uniquetab (RRID ..., duplicateInfo ...)
go
CREATE UNIQUE INDEX ix on uniquetab(RRID) WITH IGNORE_DUP_KEY
go

INSERT uniquetab SELECT * FROM duplicateRecords ORDER BY ...
go

An alternative way is to BCP-out the duplicateRecords table, and then to BCP it into the uniquetab table.
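
The "first row per key wins" behaviour described above can be tried out without an ASE server. The sketch below is only an analogy: it uses SQLite from Python, where `INSERT OR IGNORE` against a unique index plays the role of `ignore_dup_key`. The helper name `keep_first_per_key` and the `ORDER BY RRID, duplicateInfo DESC` ordering are assumptions made for illustration. Sorting the string descending only picks the highest number while the prefix and the number width stay the same.

```python
import sqlite3

# Sample rows taken from the question.
SAMPLE_ROWS = [
    (3920011, "Repeated:12009:CLEAR"),
    (3920011, "Repeated:12012:CLEAR"),
    (4233901, "Repeated:18129:HIT"),
    (4820129, "Repeated:22901:PENDING"),
    (4820129, "Repeated:22983:PENDING"),
]


def keep_first_per_key(rows):
    """Copy rows into a uniquely indexed table, keeping the first row seen per RRID."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE duplicateRecords (RRID INTEGER, duplicateInfo TEXT)")
    conn.executemany("INSERT INTO duplicateRecords VALUES (?, ?)", rows)

    conn.execute("CREATE TABLE uniquetab (RRID INTEGER, duplicateInfo TEXT)")
    conn.execute("CREATE UNIQUE INDEX ix ON uniquetab (RRID)")

    # The ordering decides which row arrives first for each key (assumed ordering).
    ordered = conn.execute(
        "SELECT RRID, duplicateInfo FROM duplicateRecords "
        "ORDER BY RRID, duplicateInfo DESC"
    ).fetchall()

    # Rows whose key already exists are silently skipped instead of raising an error.
    for row in ordered:
        conn.execute("INSERT OR IGNORE INTO uniquetab VALUES (?, ?)", row)

    result = conn.execute(
        "SELECT RRID, duplicateInfo FROM uniquetab ORDER BY RRID"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for rrid, info in keep_first_per_key(SAMPLE_ROWS):
        print(rrid, info)
```

The rows are inserted one at a time here so that the arrival order is explicit; whether a particular server honours the `ORDER BY` of an `INSERT ... SELECT` when deciding which duplicate survives should be checked on that server.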
