I need to export a SQL Server table into a CSV file, one of the issues is that some of the values in a column could contains special characters, such as new line, which will produce an extra row in the CSV file. In order to fix the issue I have a function to use to remove special characters:
CREATE FUNCTION dbo.udf_tokenize(@input VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
RETURN (SELECT CAST('<r><![CDATA[' + @input + ' ' + ']]></r>' AS XML).value('(/r/text())[1] cast as xs:token?','VARCHAR(MAX)'));
END
GO
UPDATE dbo.prospect
SET sNotes = dbo.udf_tokenize(sNotes)
It works fine for most of the cases, except the following case: SNotes column value is:
2600/month gross income (CVS—pharmacy tech—
however, the len(sNotes) returns 169. and when I apply the function to this case, it returns error message:
Msg 9420, Level 16, State 1, Line 11 XML parsing: line 1, character 56, illegal xml character
Please advise how to handle this case. Thanks
Thank you all for your comments. I dig into this issue a bit further during the weekend, and find something new. The actual value inside the column is actually like this : Thank you all for your comments. I dig into this issue a bit further during the weekend. And I notice that the ACTUAL inside the column is `
"2600/month gross income (CVS—pharmacy tech— �1 yr and 5 mo. tenure) with $350 addl in child support. Occupants: myself and 7-yr old daughter. No evictions. No felonies.,"
For some reason, my SSMD only shows the
2600/month gross income (CVS—pharmacy tech—
and truncate the rest part, so the real question is how could I remove that � from the string. Thanks again
`