0

I have a stored procedure where I am using sp_xml_preparedocument to handle XML data. But due to some invalid characters like

Ex: 1. INGENIERÍA
    2. Engineer'

in XML data, SQL throws an exception as "An invalid character can be found in the text content".

Screen shot of the exception attached

My Stored procedure look like

    DECLARE @idoc INT
    DECLARE @doc XML
    SET @doc = @DocElements -- @Docelements will have XML data
    EXEC sp_xml_preparedocument @idoc OUTPUT, @doc

    SELECT BadgeNo FROM OPENXML (@idoc, '/DocumentElement/PEScoreUpdate',2) WITH(BadgeNo  VARCHAR(50)))AND
    ASRYEAR=YEAR(GETDATE()) And IsPlanDeleted<>1
    EXEC sp_xml_removedocument @idoc 

Sampel XML data

  <DocumentElement>
  <PEScoreUpdate>
    <Badge_x0020_No>105731</Badge_x0020_No>       
    <Last_x0020_Name>Vijaya Kumar</Last_x0020_Name>
    <First_x0020_Name>Sanjay Kumar</First_x0020_Name>
    <BOC>Onshore E&amp;C</BOC>
    <Emp_x0020_Class>White Collar</Emp_x0020_Class>
    <Site>INGENIERÍA PROJECT     Secondment</Site>       
  </PEScoreUpdate>
</DocumentElement>

I am getting this XML data from an excel sheet with thousands of data so, it's practically not possible to search for invalid text and correct it manually. Anyone guide me how to handle this invalid char in SQL procedure. Is there any way to replace this invalid character but I am processing thousands of data. will it affect performance? Anyone guide me. Thanks in Advance

Arun D
  • 444
  • 3
  • 7
  • 23

1 Answers1

2

One point is, that FROM OPENXML (together with the SPs to prepare and remove a dcoument) is outdated and should not be used any more (rare exceptions exist).

To your actual question:

I'm pretty sure, that this issue is not related to FROM OPENXML, neither to sp_xml_preparedocument. You did not show, how you are passing in the XML. But - assumeably - you do this on string-level with a VARCHAR variable (or with an XML-literal. Try to use NVARCHAR or place a leading N before you literal.

If I'm right, you find details here: https://stackoverflow.com/a/42683643/5089204

UPDATE How to read XML today...

Try to read your XML like this

DECLARE @doc XML=
N'<DocumentElement>
  <PEScoreUpdate>
    <Badge_x0020_No>105731</Badge_x0020_No>       
    <Last_x0020_Name>Vijaya Kumar</Last_x0020_Name>
    <First_x0020_Name>Sanjay Kumar</First_x0020_Name>
    <BOC>Onshore E&amp;C</BOC>
    <Emp_x0020_Class>White Collar</Emp_x0020_Class>
    <Site>INGENIERÍA PROJECT     Secondment</Site>       
  </PEScoreUpdate>
</DocumentElement>';

SELECT u.value(N'(Badge_x0020_No)[1]',N'int') AS Badge_x0020_No
      ,u.value(N'(First_x0020_Name)[1]',N'nvarchar(max)') AS First_x0020_Name
      ,u.value(N'(BOC)[1]',N'nvarchar(max)') AS BOC
      ,u.value(N'(Emp_x0020_Class)[1]',N'nvarchar(max)') AS Emp_x0020_Class
      ,u.value(N'(Site)[1]',N'nvarchar(max)') AS [Site]
FROM @doc.nodes(N'/DocumentElement/PEScoreUpdate') AS A(u)

UPDATE 2

Your code shows /DocumentElement/BasicInformation as XPath, but this doesn't show up in your XML?

Furthermore, the _x0020_ within your elements names is coming from blanks in your Excel's columns names.

There are several places, where your issue might come from...

Community
  • 1
  • 1
Shnugo
  • 66,100
  • 9
  • 53
  • 114
  • How should I read an xml file with the following format? It just has one node, e.g. and each row is like this. – Nicholas Humphrey May 25 '19 at 15:53
  • @NicholasHumphrey A comment is not the best place for an entirely new question... Please start a new question. You can place a link in comments to alert people who might know the answer... But sure enough such a question will be answered in minutes. – Shnugo May 27 '19 at 07:24