247

I have a table named PAYMENT. Within this table I have a user ID, an account number, a ZIP code and a date. I would like to find all records for all users that have more than one payment per day with the same account number.

UPDATE: Additionally, there should be a filter than only counts the records whose ZIP code is different.

This is how the table looks like:

| user_id | account_no | zip   |      date |
|       1 |        123 | 55555 | 12-DEC-09 | 
|       1 |        123 | 66666 | 12-DEC-09 |
|       1 |        123 | 55555 | 13-DEC-09 |
|       2 |        456 | 77777 | 14-DEC-09 |
|       2 |        456 | 77777 | 14-DEC-09 |
|       2 |        789 | 77777 | 14-DEC-09 |
|       2 |        789 | 77777 | 14-DEC-09 |

The result should look similar to this:

| user_id | count |
|       1 |     2 |

How would you express this in a SQL query? I was thinking self join but for some reason my count is wrong.

Benjamin Muschko
  • 32,442
  • 9
  • 61
  • 82

4 Answers4

465

Use the HAVING clause and GROUP By the fields that make the row unique

The below will find

all users that have more than one payment per day with the same account number

SELECT 
 user_id,
 COUNT(*) count
FROM 
 PAYMENT
GROUP BY
 account,
 user_id,
 date
HAVING COUNT(*) > 1

Update If you want to only include those that have a distinct ZIP you can get a distinct set first and then perform you HAVING/GROUP BY

 SELECT 
    user_id,
    account_no, 
    date,
    COUNT(*)
 FROM
    (SELECT DISTINCT
            user_id,
            account_no, 
            zip, 
            date
         FROM
            payment 
    ) payment
 GROUP BY
    user_id,
    account_no,
    date
 HAVING COUNT(*) > 1
Micah Elliott
  • 9,600
  • 5
  • 51
  • 54
Conrad Frix
  • 51,984
  • 12
  • 96
  • 155
  • 1
    Notice in his results `2` has a count of `4` - You will want to discard the `Account_no` grouping I think. – JNK Aug 22 '11 at 17:54
  • No wait I think the original was right "all users that have more than one payment per day with the same account number." – Conrad Frix Aug 22 '11 at 17:57
  • it says that but his results show otherwise. Maybe have both versions with a note. – JNK Aug 22 '11 at 17:58
  • Thanks for your responses. I think that should do it. If I now wanted to add another filter that checks if the billing ZIP code (same table, different column) is different for the same date how would I modify this query? – Benjamin Muschko Aug 22 '11 at 18:06
  • I can't resolve the sample output. If we drop the Account, we'd get three rows. If we drop both date and account we'd get two rows 1,3 and 2,4. So I'm going to go ahead and trust the words over the output – Conrad Frix Aug 22 '11 at 18:08
  • @Benjamin. Add Zip to the group by clause – Conrad Frix Aug 22 '11 at 18:09
  • @Conrad Fix: I actually only want to count if the ZIP code is different. If I add it to the group by it would give me a count of 2 if there are two records with the same ZIP code. – Benjamin Muschko Aug 22 '11 at 18:15
  • @Benjamin can you update your question with the sample data and the output you want – Conrad Frix Aug 22 '11 at 18:25
  • @Conrad Frix: Updated my question. – Benjamin Muschko Aug 22 '11 at 18:30
64

Try this query:

SELECT column_name
  FROM table_name
 GROUP BY column_name
HAVING COUNT(column_name) = 1;
Dmytro Plekhotkin
  • 1,965
  • 2
  • 23
  • 47
user4019456
  • 641
  • 5
  • 2
6

I wouldn't recommend the HAVING keyword for newbies, it is essentially for legacy purposes.

I am not clear on what is the key for this table (is it fully normalized, I wonder?), consequently I find it difficult to follow your specification:

I would like to find all records for all users that have more than one payment per day with the same account number... Additionally, there should be a filter than only counts the records whose ZIP code is different.

So I've taken a literal interpretation.

The following is more verbose but could be easier to understand and therefore maintain (I've used a CTE for the table PAYMENT_TALLIES but it could be a VIEW:

WITH PAYMENT_TALLIES (user_id, zip, tally)
     AS
     (
      SELECT user_id, zip, COUNT(*) AS tally
        FROM PAYMENT
       GROUP 
          BY user_id, zip
     )
SELECT DISTINCT *
  FROM PAYMENT AS P
 WHERE EXISTS (
               SELECT * 
                 FROM PAYMENT_TALLIES AS PT
                WHERE P.user_id = PT.user_id
                      AND PT.tally > 1
              );
onedaywhen
  • 55,269
  • 12
  • 100
  • 138
  • 2
    A CTE/subquery approach is definitely worth using when readaility/maintainability is the main concern. I disagree, though, that `HAVING` is 'legacy' - it is still widely used 10 years on, and is a very useful shorthand to avoid the need for the verbosity in your example. – rjh May 16 '21 at 12:19
  • HAVING seems more readable to me, WHERE EXISTS might be faster... – Joey Nov 26 '21 at 04:43
3
create table payment(
    user_id int(11),
    account int(11) not null,
    zip int(11) not null,
    dt date not null
);

insert into payment values
(1,123,55555,'2009-12-12'),
(1,123,66666,'2009-12-12'),
(1,123,77777,'2009-12-13'),
(2,456,77777,'2009-12-14'),
(2,456,77777,'2009-12-14'),
(2,789,77777,'2009-12-14'),
(2,789,77777,'2009-12-14');

select foo.user_id, foo.cnt from
(select user_id,count(account) as cnt, dt from payment group by account, dt) foo
where foo.cnt > 1;
iryndin
  • 530
  • 1
  • 5
  • 11