2

i am going to be storing some text in a database. The text happens to be xml.

i'm only storing and reading the "blob" of text (i am not using any of the xml querying or indexing facilities).

Is there any advantage to declaring the column as xml:

CREATE TABLE docs (pk INT PRIMARY KEY, xCol XML not null)

rather than nvarchar(max):

CREATE TABLE docs (pk INT PRIMARY KEY, xCol NVARCHAR(max) not null)    

i figure that if i give SQL Server the hint that the text is actually xml, then it can apply compression for more efficient storage.

Note: The third option is for me to compress the text client-side and store the data in a varbinary(max) blob column.

Ian Boyd
  • 246,734
  • 253
  • 869
  • 1,219
  • Possible dupe: http://stackoverflow.com/questions/2883438/what-is-the-performance-penalty-of-xml-data-type-in-sql-server-when-compared-to – Tim Lehner May 03 '12 at 17:24
  • Using the `XML` datatype, your data is **not** stored in text format - it's "tokenized" and stored in a more optimal way. So yes - if you do have XML - do store it as `XML` (you can also use the XQuery methods directly on an `XML` column without conversions and messy stuff) – marc_s May 03 '12 at 18:17

1 Answers1

0

Storing this as XML will most likely result in smaller storage size than nvarchar (I have found this to be true, at least due to removing irrelevant white space and other such formatting). However, SQL server will check that the XML is well-formed before each insert and update, so you can potentially suffer a performance penalty at those times. I would argue that this check is also an advantage like any data-type check.

Tim Lehner
  • 14,813
  • 4
  • 59
  • 76
  • Hmmm, i never considered the downside that SQL Server might *alter* the XML; removing whitespace, [randomizing the order of attributes](http://stackoverflow.com/questions/10317671/internet-explorer-displaying-xml-attributes-in-random-order), etc. Is it a given that SQL Server will likely modify the original xml? – Ian Boyd May 03 '12 at 17:45
  • There are some ways to [retain your formatting](http://blogs.msdn.com/b/denisruc/archive/2005/09/12/464145.aspx), though I'm not a big fan (and would rather [re-format in c#](http://stackoverflow.com/questions/589943/net-make-xml-more-readable) or something anyway). This would presumably increase the size of the XML stored. I have not personally seen SQL Server alter the order of elements or attributes. – Tim Lehner May 03 '12 at 17:59
  • Well it's forbidden to alter the order of elements, or that would cause the XML to become invalid (based on its schema). But XML has no such ordering requirement on attributes, *"the order of attribute specifications in a start-tag or empty-element tag is not significant"* – Ian Boyd May 03 '12 at 18:10
  • Yes, I understand. marc_s perhaps said it best, above. Another way to look at it is that XML uses more CPU but less I/O. As always, you'd have to test in your own environment to see the exact trade-offs and determine if they are worth it. – Tim Lehner May 03 '12 at 19:43
  • Accepted answer as "Yes, but don't do it if you don't want your xml modified behind your back." – Ian Boyd May 16 '12 at 19:27