Possible Duplicate:
Renaming the duplicate data in sql

I have to clean up a table field (article_title) by creating a new field in which every repeated title gets a numeric suffix, like this:

 id  |      title         |     new_title
 34  | My Duplicate Title | My Duplicate Title
 ...
 95  | My Duplicate Title | My Duplicate Title (2)

I tried this MySQL query, but it doesn't work (every rank comes out as 1):

 SET @rank := 0;

 SET @prev := NULL;

 SELECT @rank := IF(@prev = title, @rank + 1, 1) AS rank,
   id, IF(@rank>1, Concat(title, ' (', @rank, ')'), title), @prev := title
 FROM articles ORDER BY title ASC

What am I doing wrong?

Giuppe
  • Possible duplicates: [Renaming the duplicate data in sql](http://stackoverflow.com/questions/7416545/renaming-the-duplicate-data-in-sql) & [How do I rename duplicates in MySQL](http://stackoverflow.com/questions/7621460/how-do-i-rename-duplicates-in-mysql-using-php-or-just-a-mysql) – Kermit Sep 10 '12 at 18:07
  • If you have a unique ID (or another unique field) then you can set the number in brackets (2), (3) and so on as the number of the records with the same title but a smaller ID: – Alex Monthy Sep 10 '12 at 18:54
  • The second possible duplicate contains a (lengthy) solution based on stored procedures. However, I would like to know why this query can't be written using variables only, and whether I missed something in my solution... – Giuppe Sep 11 '12 at 08:28

2 Answers

I think this is one of those times when you're MUCH better off just talking to the DB from Python or PHP or whatever and implementing your logic in a traditional programming language rather than SQL.

A hash table/dictionary of previously seen titles (I'd just use the original title as the key and the number of times seen as the value) will make this easy.

Python pseudo-implementation:

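# c reads the rows; c2 is a second db.cursor() object used for the updates
c.execute('select id, title from books order by id')
seen = {}  # title -> number of times seen so far
for row_id, title in c:
    if title in seen:
        seen[title] += 1
        # repeats get a numeric suffix: "Title (2)", "Title (3)", ...
        new_title = '%s (%d)' % (title, seen[title])
        # %s placeholders assume a MySQLdb-style driver
        c2.execute('update books set new_title = %s where id = %s', (new_title, row_id))
    else:
        seen[title] = 1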

You will need to add some DB connection code, and probably some transaction commits, but this should give you the idea.
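One way the connection and commit pieces might be filled in is sketched below. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for the real MySQL server, so that the example runs on its own; a MySQL driver would typically take %s placeholders instead of ?. The function name fill_new_titles and the sample rows are made up for illustration.

```python
import sqlite3


def fill_new_titles(conn):
    """Set new_title on every row; repeats of a title get ' (2)', ' (3)', ..."""
    # Read everything first so the updates do not interfere with the read.
    rows = conn.execute("SELECT id, title FROM books ORDER BY id").fetchall()
    seen = {}  # title -> number of times seen so far
    for row_id, title in rows:
        seen[title] = seen.get(title, 0) + 1
        n = seen[title]
        new_title = title if n == 1 else f"{title} ({n})"
        conn.execute(
            "UPDATE books SET new_title = ? WHERE id = ?", (new_title, row_id)
        )
    conn.commit()


# In-memory stand-in for the real database (assumption for the example).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, new_title TEXT)"
)
conn.executemany(
    "INSERT INTO books (id, title) VALUES (?, ?)",
    [(34, "My Duplicate Title"), (60, "Another Title"), (95, "My Duplicate Title")],
)
fill_new_titles(conn)
result = conn.execute("SELECT id, new_title FROM books ORDER BY id").fetchall()
```

Unlike the pseudo-implementation, this version also writes new_title for first occurrences, which matches the table shown in the question.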

Tyler Eaves
  • -1 The OP has not specified any server-side language. I feel the answer should remain with the db while suggesting a server-side alternative. – Kermit Sep 10 '12 at 18:08
  • Right tool for the job. I don't think it deserves a downvote. I suggested a workable solution, and even provided code. I mean, you could probably write a CMS in pure SQL with triggers too, but why on earth would you want to? – Tyler Eaves Sep 10 '12 at 18:11
  • I thought of a python script, but I can only access this remote db through phpMyAdmin, and I will need to execute this query regularly. I'm trying SQL first, because with python I have to duplicate the table in a local db, execute script, and then upload the changes. – Giuppe Sep 10 '12 at 18:29
  • It adds a degree of difficulty, but you could do a CSV export from phpMyAdmin and have the Python script read it line by line. Instead of running the update queries itself, the script would write a text file of the necessary updates, which you can then run as a bulk load in phpMyAdmin. – Tyler Eaves Sep 10 '12 at 18:34
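The export-and-replay idea from the last comment could look roughly like the sketch below. It assumes the CSV export has a header row with columns named id and title, that rows are exported in id order, and that the table is called articles as in the question's query. The helper names sql_quote and updates_from_csv are hypothetical, and the quoting only doubles single quotes, so titles containing backslashes would need extra care on MySQL.

```python
import csv
import io


def sql_quote(text):
    """Quote a string literal by doubling single quotes (no backslash handling)."""
    return "'" + text.replace("'", "''") + "'"


def updates_from_csv(csv_file, table="articles"):
    """Yield one UPDATE statement per exported row, suffixing repeated titles."""
    seen = {}  # title -> number of times seen so far
    for row in csv.DictReader(csv_file):
        title = row["title"]
        seen[title] = seen.get(title, 0) + 1
        n = seen[title]
        new_title = title if n == 1 else f"{title} ({n})"
        # int() guards against anything non-numeric in the id column
        yield (
            f"UPDATE {table} SET new_title = {sql_quote(new_title)} "
            f"WHERE id = {int(row['id'])};"
        )


# Stand-in for the exported file; a real run would open the CSV from disk.
export = io.StringIO(
    "id,title\n34,My Duplicate Title\n60,Bob's Title\n95,My Duplicate Title\n"
)
statements = list(updates_from_csv(export))
```

Writing the statements to a text file, one per line, gives something that can be pasted or imported into phpMyAdmin's SQL tab.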

drop table if exists t;
drop table if exists t2;
create table t (id integer, title varchar(50), new_title varchar(50));
insert into t (id, title) values (34, 'My Duplicate Title'), (95, 'My Duplicate Title');
create temporary table t2 (id integer, new_title varchar(50));
-- n = number of rows with the same title and a smaller id
insert into t2 (id, new_title)
select id, if(n = 0, title, concat(title, ' (', n + 1, ')'))
from (
  select t.id, t.title,
    (select count(*) from t as tt where tt.title = t.title and tt.id < t.id) as n
  from t
) as ranked;
update t, t2
set t.new_title = t2.new_title where t.id = t2.id;
Alex Monthy
  • This only works when I manually insert the various duplicate subsets... or am I missing something? – Giuppe Sep 11 '12 at 08:36
  • No, this shows a test case. It works with duplicate titles. The part that applies to your case starts at "create temporary table...": replace "t" with your own table name and apply it. – Alex Monthy Sep 11 '12 at 09:51
  • How do I automatically insert the duplicate titles into the temporary table? – Giuppe Sep 11 '12 at 11:35
  • That's what the INSERT statement does. What do you mean by "automatically"? – Alex Monthy Sep 11 '12 at 11:59
  • I need to select (and rank) the duplicates first. I need to know which ones are duplicates and "rank" them by their incremental id (for example, the first duplicate has rank 1, the second has rank 2, and so on). If I had this, I could use your INSERT statement. – Giuppe Sep 11 '12 at 14:04
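The ranking asked for in the last comment can come from the counting rule mentioned earlier in the thread: a row's rank is one plus the number of rows with the same title and a smaller id. The sketch below checks that rule by running a correlated subquery through Python's sqlite3 module. SQLite is only a stand-in here (an assumption so the example is runnable), the articles table contents are made up, and the column alias title_rank is hypothetical, chosen because RANK is a reserved word in recent MySQL versions.

```python
import sqlite3

# In-memory stand-in for the real table (sample rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?)",
    [
        (34, "My Duplicate Title"),
        (50, "Unique Title"),
        (95, "My Duplicate Title"),
        (120, "My Duplicate Title"),
    ],
)

# rank = 1 + number of rows with the same title and a smaller id
RANK_QUERY = """
SELECT a.id, a.title,
       1 + (SELECT COUNT(*) FROM articles AS b
            WHERE b.title = a.title AND b.id < a.id) AS title_rank
FROM articles AS a
ORDER BY a.id
"""
ranked = conn.execute(RANK_QUERY).fetchall()

# Only ranks above 1 get a suffix, matching the table in the question.
new_titles = [
    (row_id, title if rank == 1 else f"{title} ({rank})")
    for row_id, title, rank in ranked
]
```

No separate "find the duplicates first" step is needed with this rule: unique titles and first occurrences simply come out with rank 1.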