2

I've inherited a project that somehow ended up with random rows in the application settings table getting duplicated. I got the duplicate rows removed successfully, but then I noticed that the actual values, of type nvarchar, for said rows are also duplicated.

For instance, one row has Key column Error Email Address and Value column websupport@mycompany.com,websupport@mycompany.com. The Value column should just contain websupport@mycompany.com. There are numerous records like this, all following the same pattern of The value,The value.

How can I detect when the Value column contains this kind of duplicated data and correct it?

Note that the comma alone is not enough to say the row is invalid, because things like Key Default Error Message Value Oops, something went wrong are correct and also contain a comma.

Blake Hood
  • 314
  • 3
  • 12
  • You need to define better when a value is allowed to contain a comma and when it isn't. You can certainly write a constraint that traps this, but it is almost certainly based on the code that is calling the insert or the stored procedure that performs the insert. How hard would that code be to check / debug? – Aaron Bertrand Sep 03 '13 at 18:49
  • It appears that only the error message is supposed to have a comma as part of the value, and everything else is storing a comma-delimited list of numbers/strings. The error email example above results in us receiving two identical emails whenever there is an error, and there are valid cases for this such as when two separate people need to be notified of something. Although of course storing a list of values in a column is bad, that's just what I have to work with. – Blake Hood Sep 03 '13 at 19:00

4 Answers4

3

Here is an update that would solve the problem:

update t
    set value = left(value, len(val)/2)
    where left(value, len(val)/2) = right(value, len(val)/2) and
          substring(value, (len(val)/2) + 1, 1) = ',';

You can validate the logic by doing select first:

select value
from t
where left(value, len(val)/2) = right(value, len(val)/2) and
      substring(value, (len(val)/2) + 1, 1) = ',';
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
1

Here is the query:

Update TableName
set Col1=substring(Col1,0,charindex(',',Col1))
where substring(Col1,0,charindex(',',Col1))=substring(Col1,charindex(',',Col1)+1,500)

Replace Col1 with the column name, and TableName with the actual table name. It will only replace the rows where value before and after comma(,) are same with a single value.

Sonam
  • 3,406
  • 1
  • 12
  • 24
0

select * from blah where email like '%@%,%@%'

It does seem like you have other issues to deal with regarding how the bad data is being inserted.

Grax32
  • 3,986
  • 1
  • 17
  • 32
0

I would try something like https://stackoverflow.com/a/5123680/1504882 Where you would split the column on the column and when Left = Right update the column to (whatever you choose, left/right)

Community
  • 1
  • 1
Elias
  • 2,602
  • 5
  • 28
  • 57