2

I want to get a set of records where only the date (YYYY-MM-DD) and the username pair match in multiple records, all the other fields can be different.

SELECT *, count(*) from table 
GROUP BY username 
HAVING count(username)>1 AND count(date)>1;

Appears to be severely wrong, and I'm not sure how to make it work. Most of the help I see here is about entire rows that match, I am only concerned about records that match on these two columns. I want to flag these duplicates.

I have no control over this database, so I can't take future maintenance suggestions.

Kiwizoom
  • 433
  • 1
  • 8
  • 21

1 Answers1

4
create table table1
(
    id int auto_increment primary key,
    username varchar(30) not null,
    `date` date not null
);

insert table1 (username,`date`) values ('john','2015-01-01');
insert table1 (username,`date`) values ('kim','2015-01-01');
insert table1 (username,`date`) values ('john','2015-01-01');
insert table1 (username,`date`) values ('john','2015-02-01');
insert table1 (username,`date`) values ('john','2015-03-01');

SELECT t1.* 
from table1 t1
join
(
    select username,`date`,count(*)
    from table1
    group by username,`date`
    having count(username)>1
) inr
on inr.username=t1.username and inr.`date`=t1.`date`

results in 2 rows shown

+----+----------+------------+
| id | username | date       |
+----+----------+------------+
|  1 | john     | 2015-01-01 |
|  3 | john     | 2015-01-01 |
+----+----------+------------+
2 rows in set (0.03 sec)

Edit:

as per OP request, have a column to flag dupes for later work, as opposed to a select statement. Note you can Alter Table and add this nullable flag column, set it, use values at your leisure, later Alter Table and drop it.

But I will just start over here with the create table with new flag column:

create table table1
(
    id int auto_increment primary key,
    username varchar(30) not null,
    `date` date not null,
    dupeflag int null --    <---- New flag column, nullable, ignored on inserts below
);

insert table1 (username,`date`) values ('john','2015-01-01');
insert table1 (username,`date`) values ('kim','2015-01-01');
insert table1 (username,`date`) values ('john','2015-01-01');
insert table1 (username,`date`) values ('john','2015-02-01');
insert table1 (username,`date`) values ('john','2015-03-01');

update table1 t1
join 
(   select username,`date`,count(*)
    from table1
    group by username,`date`
    having count(username)>1
) inr
on inr.username=t1.username and inr.`date`=t1.`date`
set dupeflag=1;

-- 2 rows affected

select * from table1;

+----+----------+------------+----------+
| id | username | date       | dupeflag |
+----+----------+------------+----------+
|  1 | john     | 2015-01-01 |        1 |
|  2 | kim      | 2015-01-01 |     NULL |
|  3 | john     | 2015-01-01 |        1 |
|  4 | john     | 2015-02-01 |     NULL |
|  5 | john     | 2015-03-01 |     NULL |
+----+----------+------------+----------+
5 rows in set (0.00 sec)
Drew
  • 24,851
  • 10
  • 43
  • 78
  • I'm confused about there being insert statements... I've not done one like this. A temp table is created, the same values from my table are inserted, and mirror tables are compared with each other? I'm kind of lost – Kiwizoom Aug 07 '15 at 17:12
  • yeah, I was just showing you that I wasn't blowing smoke Kiwi Z, just a demo – Drew Aug 07 '15 at 17:14
  • we're a fun lot K, don't mean no harm – Drew Aug 07 '15 at 17:16
  • Thanks, this works. Is it possible to then change this to set a flag column = 1 on the original table for all the records this returns? – Kiwizoom Aug 07 '15 at 17:30
  • totally yes, do you want to share your table so it works out of the gates? Meaning, let's work off your table, not my pretend **table1** above. Or should I just do an edit and show it in my pretend world? – Drew Aug 07 '15 at 17:32
  • you could do an `alter table` and create a nullable flag column. set it, do your edits, compactions of dupes, then `alter table` and drop the flag column. *Should* have no impact on your app running. – Drew Aug 07 '15 at 17:35
  • It's a client .csv table export I dumped into a different table, so it is sensitive data but no harm can come to the loaded table. Whoa nelly, I'm pretty inexperienced at layers of mysql statements. – Kiwizoom Aug 07 '15 at 17:38
  • I have an extra column for flagging, but I get lost when things start using multiple tables and joining – Kiwizoom Aug 07 '15 at 17:39
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/85466/discussion-between-drew-pierce-and-kiwizoom). – Drew Aug 07 '15 at 17:40
  • Thank you very much for this! – Kiwizoom Aug 07 '15 at 18:00