2

Some text fields in my database have bad control characters embedded. I only noticed this when I tried to serialize an object and got an XML error on char  and . There are probably others.

How do I replace them using C#? I thought something like this would work:

text.Replace('\x2', ' ');

but it doesn't. Any help appreciated.

Graeme

3 Answers

7

Strings are immutable - you need to reassign:

text = text.Replace('\x2', ' ');
BrokenGlass
2

Exactly as was said above, strings are immutable in C#. This means that the statement:

text.Replace('\x2', ' '); 

returned the string you wanted, but didn't change the string you gave it. Since you didn't assign the return value anywhere, it was lost. Assigning the result back to the variable, as in the answer above, fixes the problem:

text = text.Replace('\x2', ' '); 

If you frequently make changes to a string, you might look at the StringBuilder class, which works much like a regular string but is mutable, and is therefore much more efficient in some situations.
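
Since the question mentions that there are probably other bad characters, here is a minimal sketch that uses a StringBuilder to replace all of them in one pass instead of calling Replace once per character. The class name ControlCharCleaner and the method ReplaceControlChars are made up for illustration, and the sketch assumes that "bad" means the characters below U+0020 that XML 1.0 does not allow (everything except tab, line feed, and carriage return).

```csharp
using System.Text;

// Hypothetical helper, not part of the .NET class library.
public static class ControlCharCleaner
{
    // Returns a copy of text in which every control character below U+0020,
    // other than tab, line feed, and carriage return, is replaced.
    // Assumption: these are the characters that cause the XML error.
    public static string ReplaceControlChars(string text, char replacement = ' ')
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            bool invalid = c < 0x20 && c != '\t' && c != '\n' && c != '\r';
            sb.Append(invalid ? replacement : c);
        }
        // The original string is untouched; the caller must use the result.
        return sb.ToString();
    }
}
```

As with Replace, the result has to be assigned back: text = ControlCharCleaner.ReplaceControlChars(text);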

Good luck!

-Craig

Fredrik Leijon
Kreg
1

The larger problem you're dealing with is the XmlSerialization round trip problem. You start with a string, you serialize it to xml, and then you deserialize the xml to a string. One expects that this always results in a string that is equivalent to the first string, but if the string contains control characters, the deserialization throws an exception.

You can fix that by passing an XmlTextReader instead of a StreamReader to the Deserialize method. Set the XmlTextReader's Normalization property to false.
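
The reader setup described above might look roughly like the following sketch. The class LenientXml and the method DeserializeString are hypothetical names, and the example assumes the serialized XML carries the control character as a numeric character entity such as &#x2;.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical helper for illustration only.
public static class LenientXml
{
    // Deserializes a string from XML using an XmlTextReader whose
    // Normalization property is false, so that numeric character
    // entities for control characters are not range-checked.
    public static string DeserializeString(string xml)
    {
        var serializer = new XmlSerializer(typeof(string));
        using (var stringReader = new StringReader(xml))
        using (var xmlReader = new XmlTextReader(stringReader) { Normalization = false })
        {
            return (string)serializer.Deserialize(xmlReader);
        }
    }
}
```

For example, LenientXml.DeserializeString("<string>a&#x2;b</string>") is expected to give back the string with the control character intact rather than throw.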

You should also be able to solve this problem by serializing the string as CDATA; see "How do you serialize a string as CDATA using XmlSerializer?" for more information.

Community
phoog