1

I have multiple tables with corrupted data like this:

Good data</title>and some random HTML<p></p><

What I would like to do is only keep "Good data" in the cell and remove everything else. I thought I can just create a small project with EF and clean the db with C# but there might be a quicker solution with SQL only? Can you use regex or some kind of substring function in SQL?

I will manually look at the table and select the field that needs to run through the code, there is no need to automate that at this point.

Aaron Bertrand
  • 272,866
  • 37
  • 466
  • 490
bobek
  • 8,003
  • 8
  • 39
  • 75

2 Answers2

4
UPDATE dbo.SQLInjectionVictimTableName
  SET UnprotectedColumn = LEFT(UnprotectedColumn, CHARINDEX('<', UnprotectedColumn) - 1)
  WHERE UnprotectedColumn LIKE '%<%';
Aaron Bertrand
  • 272,866
  • 37
  • 466
  • 490
0

If good data is consistently followed by </, you could use:

UPDATE YourTable
SET BadField = LEFT(Badfield,CHARINDEX('</',BadField)-1)
WHERE CHARINDEX('</',Badfield) > 0
Hart CO
  • 34,064
  • 6
  • 48
  • 63