Renaming duplicates per group in sql

Question

I want to write a query that will update duplicates per group:

  INPUT
+-------+-------+
| group | name  |
+-------+-------+
|     1 | name1 |
|     1 | name1 |
|     1 | name1 |
|     1 | name2 |
|     2 | name1 |
|     2 | name1 |
|     3 | name1 |
|     3 | name2 |
+-------+-------+

 OUTPUT  
+-------+----------------+
| group | name           |
+-------+----------------+
|     1 | name1          |
|     1 | name1 - Copy 1 |
|     1 | name1 - Copy 2 |
|     1 | name2          |
|     2 | name1          |
|     2 | name1 - Copy 1 |
|     3 | name1          |
|     3 | name2          |
+-------+----------------+

There is something like that here Renaming the duplicate data in sql but my problem is how to deal with groups. It is not so important how to name this duplicates but it will be cool if I can do it as specified in my example.

SQL is a unorderd set.. Without using ORDER BY clause MySQL is free to return the records in anny order. Is the order of renaming important is this case? If it is important you should alteast has a column where we can identify the natural order like a column with a primary key with auto_increment option or a datetime column with current_timestamp defualt option — Raymond Nijland, Apr 04 '18 at 15:53
Order of output is not important. But I want all duplicates inside the same group to be renamed. — Illia, Apr 04 '18 at 15:58

Paul Spiegel · Accepted Answer · 2018-04-04T17:16:44.287

2

If you have a primary key id column, then try this:

update (
    select `group`, name, min(id) as min_id
    from test
    group by `group`, name
) x
join test t using (`group`, name)
set t.name =
    case when t.id <> x.min_id
    then concat(t.name, ' - Copy ', t.id - x.min_id)
    else t.name
    end
;

Demo: http://rextester.com/AWEX77086

Here is another way, which is probably slower, but will guarantee consecutive copy numbers.

update (
    select l.id, count(*) as copy_nr
    from test l
    join test r
      on  r.group = l.group
      and r.name  = l.name
      and r.id < l.id
    group by l.id
) x
join test t using (id)
set t.name = concat(t.name, ' - Copy ', x.copy_nr);

Demo: http://rextester.com/NWSF57017

edited Apr 04 '18 at 17:16

answered Apr 04 '18 at 16:43

Paul Spiegel

30,925
5
44
53

Nice find but this answer would require ascending id's in a group without holes in the id's – Raymond Nijland Apr 04 '18 at 17:03
@RaymondNijland This is true but: "It is not so important how to name this duplicates". – Paul Spiegel Apr 04 '18 at 17:05
Thanks! It is simple and helped me a lot. – Illia Apr 04 '18 at 22:11

mvisser · Answer 2 · 2018-04-04T16:01:49.800

1

Try the below, replace Groups, with Group.

Ok, I was pointed out that this is a MySQL question, so the below wont work for MySQL, but only for t-sql.

SELECT Groups,
    CASE WHEN Duplicate > 1
            THEN Name + ' - Copy ' + CONVERT(VARCHAR(10), Duplicate)
         ELSE Name
    END AS Name
FROM
(
    SELECT Groups,
        Name,
        ROW_NUMBER() OVER(PARTITION BY Name, Groups ORDER BY Name ) As Duplicate
    FROM TableName
) AS Data
ORDER BY Groups

edited Apr 04 '18 at 16:01

answered Apr 04 '18 at 15:55

mvisser

652
5
11

You're assuming MySQL 8.x – McNets Apr 04 '18 at 15:57
2

Question is tagged MySQL. MySQL doesn't support window functions until MySQL 8.0+ which is not production ready jet. So i doubt the topicstarter is using MySQL 8.0+ – Raymond Nijland Apr 04 '18 at 15:57
Oh yes very true. – mvisser Apr 04 '18 at 15:58
select version returns 5.6.33 – Illia Apr 04 '18 at 16:14

score 1 · Answer 3 · answered Apr 04 '18 at 16:21

This is untested, but you can similate the LAG function seen in many other RDBMS' with variables (more info here).

The idea is you store the group and name fields and compare against them before updating them.

SET @RowNumber = 0;
SET @PreviousGroup = NULL;
SET @PreviousName = NULL;

SELECT
  @PreviousGroup AS PreviousGroup,
  @PreviousName AS PreviousName,
  CASE 
    WHEN @PreviousGroup = `group` AND @PreviousName = `name` THEN (@RowNumber := @RowNumber + 1)
    ELSE @RowNumber := 0
  END AS `Counter`,
  CASE 
    WHEN @PreviousGroup = `group` AND @PreviousName = `name` THEN CONCAT(`name`,'- Copy ',@RowNumber)
    ELSE `name`
  END AS `Name`,
  @PreviousGroup := `group` AS RawGroup,
  @PreviousName := `name` AS RawName
FROM
  tbl1
ORDER BY
  `group` ASC,
  `name` ASC;

score 1 · Answer 4 · answered Apr 04 '18 at 17:11

You can do this using variables. I would recommend:

set @i = 0;
set @gn := '';

update t
    set name = concat_ws(' - Copy ', name,
                         nullif(if(@gn = concat_ws(':', group, name), @i := @i + 1,
                                   if(@gn := concat_ws(':', group, name), @i := 1, @i := 1)
                                  ), 0)
    order by t.group, name;

Renaming duplicates per group in sql

4 Answers4