6

Here is my testing table data:

Testing

ID      Name            Payment_Date   Fee                Amt
1       BankA           2016-04-01     100                20000
2       BankB           2016-04-02     200                10000
3       BankA           2016-04-03     100                20000
4       BankB           2016-04-04     300                20000

I am trying to compare fields Name, Fee and Amt of each data records to see whether there are the same values or not. If they got the same value, I'd like to mark something like 'Y' to those record. Here is the expected result

ID      Name            Payment_Date   Fee                Amt      SameDataExistYN
1       BankA           2016-04-01     100                20000    Y
2       BankB           2016-04-02     200                10000    N
3       BankA           2016-04-03     100                20000    Y
4       BankB           2016-04-04     300                20000    N

I have tried these two methods below. but I am looking for any other solutions so I can pick out the best one for my work.

Method 1.

select t.*, iif((select count(*) from testing where name=t.name and fee=t.fee and amt=t.amt)=1,'N','Y') as SameDataExistYN from testing t

Method 2.

select t.*, case when ((b.Name = t.Name)
                        and (b.Fee = t.Fee) and (b.Amt = t.Amt)) then 'Y' else 'N' end as SameDataExistYN
from testing t
left join ( select Name,  Fee, Amt
            from testing
            Group By Name,  Fee, Amt
            Having count(*)>1  ) as b on b.Name = t.Name
                                      and b.Fee = t.Fee
                                      and b.Amt = t.Amt
Chhun Panharath
  • 136
  • 1
  • 10

4 Answers4

3

There are several approaches, with differences in performance characteristics.

One option is to run a correlated subquery. This approach is best suited if you have a suitable index, and you are pulling a relatively small number of rows.

SELECT t.id
     , t.name
     , t.payment_date
     , t.fee
     , t.amt
     , ( SELECT 'Y' 
           FROM testing s
          WHERE s.name = t.name
            AND s.fee  = t.fee
            AND s.amt  = t.amt
            AND s.id  <> t.id
          LIMIT 1
        ) AS SameDataExist
  FROM testing t
 WHERE ...
 LIMIT ...

The correlated subquery in the SELECT list will return a Y when there is at least one "matching" row found. If no "matching" row is found, SameDataExist column will have a value of NULL. To convert the NULL to an 'N', you could wrap the subquery in an IFULL() function.


Your method 2 is a workable approach. The expression in the SELECT list doesn't need to do all those comparisons, those have already been done in the join predicates. All you need to know is whether a matching row was found... just testing one of the columns for NULL/NOT NULL is sufficient.

SELECT t.id
     , t.name
     , t.payment_date
     , t.fee
     , t.amt
     , IF(s.name IS NOT NULL,'Y','N') AS SameDataExists
  FROM testing t
  LEFT
  JOIN ( -- tuples that occur in more than one row
         SELECT r.name, r.fee, r.amt
           FROM testing r
          GROUP BY r.name, r.fee, r.amt
         HAVING COUNT(1) > 1
       ) s
    ON s.name = t.name
   AND s.fee  = t.fee
   AND s.amt  = t.amt
 WHERE ...

You could also make use of an EXISTS (correlated subquery)

spencer7593
  • 106,611
  • 15
  • 112
  • 140
2

Check this out

Select statement to find duplicates on certain fields

Not sure how to mark this as a dupe...

Community
  • 1
  • 1
karina
  • 805
  • 5
  • 10
1

Here is another method, but I think you have to run tests on your data to find out which is best:

SELECT
  t.*,
  CASE WHEN EXISTS(
    SELECT * FROM testing WHERE id <> t.id AND Name = t.Name AND Fee = t.Fee AND Amt = t.Amt
  ) THEN 'Y' ELSE 'N' END SameDataExistYN
FROM
  testing t 
;
maraca
  • 8,468
  • 3
  • 23
  • 45
1

Select t.name ,t.fee,t.amt,if(count(*)>1),'Y','N') from testing t group by t.name,t.fee,t.amt

Rasaiya
  • 11
  • 2