0

I want to write a query that will update duplicates per group:

  INPUT
+-------+-------+
| group | name  |
+-------+-------+
|     1 | name1 |
|     1 | name1 |
|     1 | name1 |
|     1 | name2 |
|     2 | name1 |
|     2 | name1 |
|     3 | name1 |
|     3 | name2 |
+-------+-------+

 OUTPUT  
+-------+----------------+
| group | name           |
+-------+----------------+
|     1 | name1          |
|     1 | name1 - Copy 1 |
|     1 | name1 - Copy 2 |
|     1 | name2          |
|     2 | name1          |
|     2 | name1 - Copy 1 |
|     3 | name1          |
|     3 | name2          |
+-------+----------------+

There is something like that here Renaming the duplicate data in sql but my problem is how to deal with groups. It is not so important how to name this duplicates but it will be cool if I can do it as specified in my example.

Illia
  • 43
  • 4
  • SQL is a unorderd set.. Without using ORDER BY clause MySQL is free to return the records in anny order. Is the order of renaming important is this case? If it is important you should alteast has a column where we can identify the natural order like a column with a primary key with auto_increment option or a datetime column with current_timestamp defualt option – Raymond Nijland Apr 04 '18 at 15:53
  • 1
    Order of output is not important. But I want all duplicates inside the same group to be renamed. – Illia Apr 04 '18 at 15:58
  • @RaymondNijland, yes I have primary id in my real example. – Illia Apr 04 '18 at 16:21

4 Answers4

2

If you have a primary key id column, then try this:

update (
    select `group`, name, min(id) as min_id
    from test
    group by `group`, name
) x
join test t using (`group`, name)
set t.name =
    case when t.id <> x.min_id
    then concat(t.name, ' - Copy ', t.id - x.min_id)
    else t.name
    end
;

Demo: http://rextester.com/AWEX77086

Here is another way, which is probably slower, but will guarantee consecutive copy numbers.

update (
    select l.id, count(*) as copy_nr
    from test l
    join test r
      on  r.group = l.group
      and r.name  = l.name
      and r.id < l.id
    group by l.id
) x
join test t using (id)
set t.name = concat(t.name, ' - Copy ', x.copy_nr);

Demo: http://rextester.com/NWSF57017

Paul Spiegel
  • 30,925
  • 5
  • 44
  • 53
1

Try the below, replace Groups, with Group.

Ok, I was pointed out that this is a MySQL question, so the below wont work for MySQL, but only for t-sql.

SELECT Groups,
    CASE WHEN Duplicate > 1
            THEN Name + ' - Copy ' + CONVERT(VARCHAR(10), Duplicate)
         ELSE Name
    END AS Name
FROM
(
    SELECT Groups,
        Name,
        ROW_NUMBER() OVER(PARTITION BY Name, Groups ORDER BY Name ) As Duplicate
    FROM TableName
) AS Data
ORDER BY Groups

enter image description here

mvisser
  • 652
  • 5
  • 11
1

This is untested, but you can similate the LAG function seen in many other RDBMS' with variables (more info here).

The idea is you store the group and name fields and compare against them before updating them.

SET @RowNumber = 0;
SET @PreviousGroup = NULL;
SET @PreviousName = NULL;

SELECT
  @PreviousGroup AS PreviousGroup,
  @PreviousName AS PreviousName,
  CASE 
    WHEN @PreviousGroup = `group` AND @PreviousName = `name` THEN (@RowNumber := @RowNumber + 1)
    ELSE @RowNumber := 0
  END AS `Counter`,
  CASE 
    WHEN @PreviousGroup = `group` AND @PreviousName = `name` THEN CONCAT(`name`,'- Copy ',@RowNumber)
    ELSE `name`
  END AS `Name`,
  @PreviousGroup := `group` AS RawGroup,
  @PreviousName := `name` AS RawName
FROM
  tbl1
ORDER BY
  `group` ASC,
  `name` ASC;
kchason
  • 2,836
  • 19
  • 25
1

You can do this using variables. I would recommend:

set @i = 0;
set @gn := '';

update t
    set name = concat_ws(' - Copy ', name,
                         nullif(if(@gn = concat_ws(':', group, name), @i := @i + 1,
                                   if(@gn := concat_ws(':', group, name), @i := 1, @i := 1)
                                  ), 0)
    order by t.group, name;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786