1

I'm trying to remove personal information (strings) from a database, ie. usernames, addresses, etc for testing purposes. Then I'd like to use the string that was removed to search back through the DB to see if it occurs elsewhere. I'm using SQL Server Management Studio.

For a table MyTable, and column PersonalInfo, a possible entry might look like:

Username=JohnSmith,Country=US,InvoiceNumber=12345

I have another table with a column Redacted that looks like

Username=*,Country=*,InvoiceNumber=*

Replacing is easy, using replace(PersonalInfo, Replace(Redacted,'*','%'), Redacted);. However, is there a way to return JohnSmith and other data in a temporary table? Basically I'd like to identify what is replaced by % in a regular expression.

I've found ways to do this in other languages, such as java(StackOverflow) but I would like to stay within SQL if possible.

Now, I realize it would be better if Username,Country, and InvoiceNumber were separate columns and not just one long string, but the database was given to me as is and changing the structure is not an option. Any advice? Thanks in advance. This is my first SO question, so let me know if I can format it better.

Community
  • 1
  • 1
Smurfy7777
  • 13
  • 4
  • You can use an [`OUTPUT`](https://msdn.microsoft.com/en-us/library/ms177564.aspx) clause to get any data from the rows (Note plural.), e.g. identity column values for newly inserted rows. `OUTPUT` can be used with `INSERT`, `UPDATE`, `DELETE` and `MERGE` and provides access to both before and after values in the case of `UPDATE`. A tool well worth having in your pocket. – HABO Jun 09 '15 at 15:45

1 Answers1

0

You could try the output keyword, that won't tell you exactly what was taken out but it will tell you the state of the record before you changed it. And yes, it would work better for you if the information was properly normalized.

UPDATE table
SET a = replace.....
OUTPUT DELETED.*
WHERE stuff

See docs...https://msdn.microsoft.com/en-us/library/ms177564.aspx

saarrrr
  • 2,754
  • 1
  • 16
  • 26
  • You can output both `inserted` and `deleted` from an `update` and compare the before and after values later to determine what was removed. – HABO Jun 09 '15 at 15:56
  • Thanks to both of you. I think `OUTPUT` will work for my purposes with a little tweaking. I'll try it out and accept the answer if it works. – Smurfy7777 Jun 09 '15 at 16:17