I'm having some issues representing 
, a valid UTF-16 construct and also apparently valid in a Windows filename, in XML for consumption in SQL Server XML (2012).
Take this for example:
DECLARE @Xml xml;
SET @Xml = N'<?xml version="1.0"?><FileName>풜〣</FileName>'
-- Result: XML parsing: line 1, character 41, illegal xml character
This is, however, legal Unicode (a "Unicode Noncharacter" ''): http://www.fileformat.info/info/unicode/char/ffff/index.htm
So, I tried this:
DECLARE @Xml xml;
SET @Xml = N'<?xml version="1.0" encoding="utf-16"?><FileName>풜〣</FileName>'
-- Result: XML parsing: line 1, character 59, illegal xml character
So - how am I supposed to represent this file name accurately in XML? I can't just remove the characters, they are indeed 
characters as reported by Get-ChildItem, and I will need to retain a handle to this file.
I can get the XML parsing to work by replacing 
with ï¿¿
which is the UTF-8 representation of \uFFFF
according to this link I then try to take this XML and insert it to a nvarchar
column, and I need this to be the proper representation of the filename.
DECLARE @Xml xml;
SET @Xml = N'<?xml version="1.0"?><FileName>풜〣ï¿¿ï¿¿</FileName>'
SELECT F.Item.value('.', 'nvarchar(2000)') FROM @Xml.nodes('//FileName') as F(Item)
-- Returns 풜〣ï¿¿ï¿¿ (not correct)