Our end users still copy and paste text from Word and Excel into form fields, and we end up with a lot of unwanted characters in our database tables. I've tried a number of things to remove unwanted characters from strings. The latest offender is a non-printable character that I cannot get rid of.

I have tried the following to no avail:

summary = Regex.Replace(summary, @"[^\u0000-\u007F]+", string.Empty);

summary = Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(summary));

Does saving it to the database somehow change its value?

This query does find the offending string in the DB:

select *
from Project
where CharIndex(CHAR(2), summary) > 0

The server error that gets thrown is this:

System.ArgumentException: '', hexadecimal value 0x02, is an invalid character

which is why I tried the Regex solution first (\u0002 seems to be the offending character as far as C# is concerned).

Guru Stron
DanTheMan1966
  • That's not how you would typecast in SQL. Try the `CAST` function instead. – 500 - Internal Server Error Feb 17 '23 at 19:00
  • That's just to find the projects with offending summaries. And it does work. – DanTheMan1966 Feb 17 '23 at 19:02
  • Round-tripping through ASCII would do nothing; STX is a valid ASCII control code. Your regex similarly does nothing because it doesn't exclude code point 2. The error you get is *not* a SQL Server error but an error produced by whatever client you're using, which shows the string *is* returned, with the STX intact. Try `CONVERT(VARBINARY(MAX), summary)` to see the contents and `REPLACE(summary, CHAR(2) COLLATE Latin1_General_BIN2, '')` to reliably remove it (as non-binary collations mostly ignore control characters). Try `[\u0000-\u001f]` on the C# end to match control characters instead. – Jeroen Mostert Feb 17 '23 at 19:17
  • You could have an issue with a character encoding mismatch. https://stackoverflow.com/questions/1929812/how-does-cut-and-paste-affect-character-encoding-and-what-can-go-wrong – Eric J. Feb 18 '23 at 00:15
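
The C#-side suggestion in the comments could be sketched roughly as follows. This is only an illustration, not a tested fix for the project in question: `TextCleaner` and `StripControlChars` are hypothetical names, and the choice to keep tab, line feed, and carriage return is an assumption (the comment's `[\u0000-\u001f]` range would remove those as well, which may not be wanted for multi-line summaries).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the name and placement are assumptions for illustration.
public static class TextCleaner
{
    // Matches C0 control characters U+0000-U+001F, except tab (U+0009),
    // line feed (U+000A), and carriage return (U+000D), which are kept
    // on the assumption that line breaks in the text should survive.
    private static readonly Regex ControlChars =
        new Regex(@"[\u0000-\u0008\u000B\u000C\u000E-\u001F]", RegexOptions.Compiled);

    // Returns the input with the matched control characters removed.
    // Null or empty input is returned unchanged.
    public static string StripControlChars(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return input;
        }
        return ControlChars.Replace(input, string.Empty);
    }
}
```

It would be called before saving, for example `summary = TextCleaner.StripControlChars(summary);`. Unlike `[^\u0000-\u007F]+`, which only removes characters outside the ASCII range, this pattern targets the control range that contains `\u0002`.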

0 Answers