0

I have an XML file with SHIFT-JIS encoding as below:

<?xml version="1.0" encoding="SHIFT-JIS" standalone="yes"?>
<海外管理ファイル><PO番号>GV05097</PO番号><データベース><PO><Tbl_PO_H PO番号="GV05097"><DATA><PO番号 TYPE="200" LENGTH="13">GV05097</PO番号></DATA></Tbl_PO_H></PO></データベース></海外管理ファイル>

And I use SQL store procedure to insert it into SQL table:

alter PROCEDURE [dbo].[proc_TBL_PO_H_LoadXMLPO]
@xml  XML
AS  
    BEGIN
        SET NOCOUNT ON;
        INSERT INTO [ENVIETNAMPO].[dbo].[TBL_PO_H]
              SELECT
                  TBL_PO_H.value('(PO番号/text())[1]','varchar(13)') AS PO番号, --TAG
                  TBL_PO_H.value('(PO発行日/text())[1]','varchar(10)') AS PO発行日                  
              FROM
                 @xml.nodes('/海外管理ファイル/データベース/PO/Tbl_PO_H/DATA')AS TEMPTABLE(TBL_PO_H)      
    END

C# code for Load XML button:

string pathUser = Environment.GetFolderPath(Environment.SpecialFolder.UserProfile);
            string pathDesktop = Path.Combine(pathUser, "Desktop");
            var xmlfilename = string.Empty;
            var xmlfilePath = string.Empty;
            //var dt = new DataTable();           
            var sqlConn = new SqlConnection(strConStr);
            try
            {
                    openFileDialog1.InitialDirectory = @pathDesktop;
                    openFileDialog1.Title = "Browse XML PO File";
                    openFileDialog1.Filter = "XML files (*.xml)|*.xml|All files (*.*)|*.*";
                    openFileDialog1.CheckFileExists = true;
                    openFileDialog1.CheckPathExists = true;
                    openFileDialog1.ShowHelp = true;
                    openFileDialog1.FileName = "*.xml";
                if (openFileDialog1.ShowDialog() == DialogResult.OK)
                {
                    xmlfilename =openFileDialog1.SafeFileName;
                    xmlfilePath = pathDesktop +"\\"+ xmlfilename;
                    string xml = File.ReadAllText(xmlfilePath, Encoding.GetEncoding("SHIFT-JIS"));
                    sqlConn.Open();

                    var cmd = new SqlCommand("proc_TBL_PO_H_LoadXMLPO", sqlConn);
                    cmd.CommandType = CommandType.StoredProcedure;
                    cmd.Parameters.AddWithValue("@xml", xml);
                    SqlCommand arithabortCommand = new SqlCommand("SET ARITHABORT ON", sqlConn);
                    arithabortCommand.ExecuteNonQuery();  
                    cmd.ExecuteNonQuery(); 
                    sqlConn.Close();
                }

                    MessageBox.Show("PO XML File has been imported successfully.", "Information",
                    MessageBoxButtons.OK,
                    MessageBoxIcon.Information);
            }
            catch (Exception ex)
            {

                MessageBox.Show(ex.ToString());
            }

But when loading, an error occur as the following, I tried to read XML file with correct encoding as the XML file. Please help me. Thank you!

System.Data.SqlClient.SqlException 0x80131904: parsing XML Line 1, character 59. Cannot swicth I- code ....
Shnugo
  • 66,100
  • 9
  • 53
  • 114
Cát Tường Vy
  • 398
  • 6
  • 32
  • I think Shift-JIS encoding setting uses underscore instead of dash (unlike UTF-8): `string xml = File.ReadAllText(xmlfilePath, Encoding.GetEncoding("shift_jis"));`. Can you provide what result returned by `xml` string to make sure? – Tetsuya Yamamoto Dec 11 '17 at 09:40
  • Tried "shift_jis" and string xml = File.ReadAllText(xmlfilePath, Encoding.GetEncoding(932)); but it still get the same error – Cát Tường Vy Dec 11 '17 at 09:43
  • Well, your issue is related to `XML` data type used as parameter in stored procedure. Try getting rid of `AddWithValue` and use explicit data type: `cmd.Parameters.Add("@xml", SqlDbType.Xml).Value = xml;`. This issue may be similar to consider with: https://stackoverflow.com/questions/3760788/how-to-solve-unable-to-switch-the-encoding-error-when-inserting-xml-into-sql-s. – Tetsuya Yamamoto Dec 11 '17 at 10:01
  • I tried but it was unsuccessful. I think I must try to eliminate the XML header by C# code. It will be solved my problem – Cát Tường Vy Dec 12 '17 at 02:37

1 Answers1

1

If I get this correctly you are passing the XML as is from C# into a SQL-Server stored procedure. You do not state the actual RDBMS, but I assume this is SQL-Server (due to the error message about the "switch" of an encoding). Hope this is correct, if not, this might help with other databases too...

Some things to know:

  • Within C# XML is either the string you see, or an hierarchically organized document like XmlDocument.
  • In any case, the XML is transposed into its string representation when you pass it over to the database (serialisation)
  • All strings in C# are unicode. You can define a special encoding and shift an encoded string into a byte array, but the string type itself is unicode in any case.
  • SQL Server will take the string and parse it into the native XML data type, which is a hierarchy table internally.
  • All parts of the XML (tag names, content, ...) are stored in SQL-Servers NVARCHAR, which is kind of unicode (UCS-2 actually).
  • SQL Server will - in any case - not allow you to store this declaration together with the XML. It will be omitted in any case...

So what is going on here:

You hand over a string, which is unicode actually, but the string tells the engine: No, I'm SHIFT-JIS!. This string is a liar :-D

This declaration is only needed if you store this XML in any byte container (like a file) and you want to tell a reader how to decode the content.

But between C# and the database there's no need to mess around: The string is plain unicode and will be taken into (almost) plain unicode.

Easy solution:

Pass the XML without the <?xml blah?> declaration.

UPDATE

About your question "how to strip of the declaration"?

At the moment when you've got the XML

string xml = File.ReadAllText(xmlfilePath, Encoding.GetEncoding("SHIFT-JIS"));

You do not have an XML (native type), but you've got a (unicode) string which looks like an XML.

You can use any string method here:

  • Use .IndexOf() to find ?> (end of declaration) and use .Substring() to cut away the declaration entirely
  • Use .Replace() to change your encoding to encoding="utf-16"
  • Use RegEx, whatever you like...

On the other side you might pass the string to the stored procedure as NVARCHAR(MAX) (the SP's parameter) and do the cut-off there before SQL-Server tries to take this as XML. But I'd suggest to solve this on C# side.

Community
  • 1
  • 1
Shnugo
  • 66,100
  • 9
  • 53
  • 114
  • In fact, removing `encoding="SHIFT-JIS"` seems to solve the XML conversion error (tested with MSSQL 2008) – devio Dec 11 '17 at 22:18
  • I am having an issue that we receive xml files from other branches so I do not want to mannually remove . Any way to skip this header by C# code? – Cát Tường Vy Dec 12 '17 at 02:35
  • I solved this issue now but i am facing with another problem is convert XML datatype <下代 TYPE="6" LENGTH="8">0.009下代> to money type in SQL . I used TBL_PO_M.value('(下代)[1]','money') AS 下代 but cannot. Thanks – Cát Tường Vy Dec 12 '17 at 08:07
  • @user3035133 Please avoid follow up questions. Next time please start a new question. This is very easy: In `T-SQL` you must place an `N` before all your literals. Try this: `SELECT '(下代)[1]' AS without_N, N'(下代)[1]' AS with_N` – Shnugo Dec 12 '17 at 08:16