2

I am using XDocument to keep a sort of database. This database consists of registered chatterbots: I simply have many "bot" nodes with attributes such as "username" and "owner". However, occasionally some smart guy decides to make a bot with a very strange character in one of the properties. This makes the XDocument classes throw an exception whenever that node is processed. That is a very large problem, because the database fails to save completely: writing to the file stops as soon as it hits the invalid character.

My question is this: is there a simple method, something like XSomething.IsValidString(string s), so I can just omit the offending data? My database is not the official one, it is just for personal use, so it is not imperative that I include the bad data.

Some code that I am using (the variable file is the XDocument):
To save:
file.Save(Path.Combine(Environment.CurrentDirectory, "bots.xml"));

To load (after checking if File.Exists() etc etc):
file = XDocument.Load(Path.Combine(Environment.CurrentDirectory, "bots.xml"));

To add to the database (variables are all strings):

            file.Root.Add(new XElement("bot",
                new XAttribute("username", botusername),
                new XAttribute("type", type),
                new XAttribute("botversion", botversion),
                new XAttribute("bdsversion", bdsversion),
                new XAttribute("owner", owner),
                new XAttribute("trigger", trigger)));

Pardon my lack of proper XML techniques, I'm just starting. What I'm asking is whether there is an XSomething.IsValidString(string s) method, not how terrible my XML is.

Ok, I just got the exception again, here is the exact message and stack trace.

System.ArgumentException: '', hexadecimal value 0x07, is an invalid character.
at System.Xml.XmlUtf8RawTextWriter.InvalidXmlChar(Int32 ch, Byte* pDst, Boolean entitize)
at System.Xml.XmlUtf8RawTextWriter.WriteAttributeTextBlock(Char* pSrc, Char* pSrcEnd)
at System.Xml.XmlUtf8RawTextWriter.WriteString(String text)
at System.Xml.XmlUtf8RawTextWriterIndent.WriteString(String text)
at System.Xml.XmlWellFormedWriter.WriteString(String text)
at System.Xml.XmlWriter.WriteAttributeString(String prefix, String localName, String ns, String value)
at System.Xml.Linq.ElementWriter.WriteStartElement(XElement e)
at System.Xml.Linq.ElementWriter.WriteElement(XElement e)
at System.Xml.Linq.XElement.WriteTo(XmlWriter writer)
at System.Xml.Linq.XContainer.WriteContentTo(XmlWriter writer)
at System.Xml.Linq.XDocument.WriteTo(XmlWriter writer)
at System.Xml.Linq.XDocument.Save(String fileName, SaveOptions options)
at System.Xml.Linq.XDocument.Save(String fileName)
at /* my code stack trace omitted */
Cœur

3 Answers

3

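Try replacing the file.Save line with the following code:

// Turn off character checking so invalid characters no longer throw on save.
XmlWriterSettings settings = new XmlWriterSettings();
settings.CheckCharacters = false;
// Dispose the writer so the output is flushed and the file is closed.
using (XmlWriter writer = XmlWriter.Create(Path.Combine(Environment.CurrentDirectory, "bots.xml"), settings))
{
    file.Save(writer);
}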

source: http://sartorialsolutions.wordpress.com/page/2/
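
One thing to watch with the CheckCharacters = false approach: a file written this way may contain characters (or character references such as &#x7;) that a default XDocument.Load will reject, so the reader probably needs the same setting. Below is a minimal round-trip sketch under that assumption. The LenientXml class and its Save and Load method names are hypothetical, and it writes to a string rather than to bots.xml only to keep the example self-contained.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, not part of the original answer.
static class LenientXml
{
    // Serialize the document without character checking.
    public static string Save(XDocument doc)
    {
        var settings = new XmlWriterSettings { CheckCharacters = false, Indent = true };
        var sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            doc.Save(writer);
        }
        return sb.ToString();
    }

    // Parse XML text without character checking, mirroring Save.
    public static XDocument Load(string xml)
    {
        var settings = new XmlReaderSettings { CheckCharacters = false };
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            return XDocument.Load(reader);
        }
    }
}
```

With a file on disk, the same reader settings can be passed to XmlReader.Create(path, settings) before calling XDocument.Load(reader). Note that other XML tools may still refuse such a file, since it is not strictly well-formed XML 1.0.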

  • This looks like it will work, thanks. The offending bot is not online right now, so I can't test it, but thanks a lot. It looks like 0x07 is the only invalid character. If this is true, I may do a string.Replace("\a", "");, but your solution is better for those who need to store the \a. –  Apr 07 '12 at 18:54
0

First, can you check whether your XML file is saved with the proper encoding? I normally save XML files as UTF-8, and you can declare the encoding in the XML header:

<?xml version="1.0" encoding="UTF-8"?>

Of course, the body of your XML must also conform to the XML standard. Here is a good article about it:

http://weblogs.sqlteam.com/mladenp/archive/2008/10/21/Different-ways-how-to-escape-an-XML-string-in-C.aspx

Cloud Xu
  • Yep, my header is exactly that line, except it's "utf-8" instead of "UTF-8". That shouldn't make a difference, though (I think). I looked up the hex value 0x07, and it's apparently the "motherboard beep", I think `\a` in code. I could just check for this character, but there may be more invalid characters that I don't know about. –  Apr 07 '12 at 18:41
0

From .NET 4 onward, you can use XmlConvert.VerifyXmlChars(string content). It throws an XmlException if the string contains a character that is not valid in XML, and returns the string unchanged otherwise.
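
As a rough sketch of how this could be wrapped into the IsValidString-style check the question asks for: the XmlText class and both method names below are hypothetical, and only XmlConvert.VerifyXmlChars, XmlConvert.IsXmlChar and char.IsSurrogatePair are framework calls. The second method is an optional variation that drops the offending characters instead of rejecting the whole string. Both assume a non-null input.

```csharp
using System;
using System.Text;
using System.Xml;

// Hypothetical helper class, not a framework type.
static class XmlText
{
    // True when every character in s is allowed in XML.
    public static bool IsValidString(string s)
    {
        try
        {
            XmlConvert.VerifyXmlChars(s);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    // Returns s with characters that are not valid in XML removed.
    public static string StripInvalidChars(string s)
    {
        var sb = new StringBuilder(s.Length);
        for (int i = 0; i < s.Length; i++)
        {
            if (char.IsSurrogatePair(s, i))
            {
                // Keep well-formed surrogate pairs (characters above U+FFFF).
                sb.Append(s[i]).Append(s[i + 1]);
                i++;
            }
            else if (XmlConvert.IsXmlChar(s[i]))
            {
                sb.Append(s[i]);
            }
        }
        return sb.ToString();
    }
}
```

In the question's code, each value could be passed through XmlText.StripInvalidChars before it goes into an XAttribute, or a bot could be skipped entirely when XmlText.IsValidString returns false for any of its properties.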

Aristoteles