0

I am attempting to parse some XML using SQL Server Express 2017. I can't post a sample of the XML because it contains patient data. When I use the cross apply function to attempt to align multiple XML data elements with the left, I get everything in the XML as a single string of characters. I have tried everything I can find on StackOverflow and Google, and nothing works. I've tried multiple select statements, outer apply, distinct, and everything either has bad syntax or produces the same result. Here is the query I am attempting to use.

USE PatientSurvey
CREATE TABLE PG_Questions(Survey_ID int, Client_ID int, Service2 varchar(6),
RecDate date, DisDate date, Varname3 varchar(200), [Value] int);
GO

Declare @fileData XML

-- import the file contents into the variable
Select @fileData=BulkColumn from 
OpenRowSet(Bulk'C:\path_here\XML_test.txt',Single_blob) x;

insert into PG_Questions
  (Survey_ID, Client_ID, Service2, RecDate, DisDate, Varname3) 
select

Data.xData.query('SURVEY_ID').value('.','int'),
Data.xData.query('CLIENT_ID').value('.','int'),
Data.xData.query('SERVICE').value('.','varchar(6)'),
Data.xData.query('RECDATE').value('.','date'),
Data.xData.query('DISDATE').value('.','date'),
Data.xData.query('ANALYSIS/RESPONSE/VARNAME').value('.','varchar(200)')
--  Data.xData.query('VALUE').value('.','int')
from @fileData.nodes('/DATA_EXPORT/PATIENTLEVELDATA') as Data(xData)
CROSS APPLY Data.xData.nodes('/DATA_EXPORT/PATIENTLEVELDATA/ANALYSIS/RESPONSE/VARNAME') 
as Data2(xData2)
--CROSS APPLY @fileData.nodes('/DATA_EXPORT/PATIENTLEVELDATA/ANALYSIS/RESPONSE/VALUE')
   as Data3(xData3)


SELECT * FROM PG_Questions;
GO

DROP TABLE PG_Questions;
GO

Here is what I get. The Varname3 column just repeats all the data in the XML without breaking it up. Is it possible to get the cross apply to do this properly? Disregard the Value column, I am waiting to get this column working first. Each cell is supposed to be a two character code, but the SQL is running them all together. How do I break it up into rows in a single column?

Data

Sample XML:

<DATA_EXPORT>
<PATIENTLEVELDATA>
    <SURVEY_ID>00000001</SURVEY_ID>
    <CLIENT_ID>0000002</CLIENT_ID>
    <SERVICE>IN</SERVICE>
    <RECDATE>2018-01-12</RECDATE>
    <DISDATE>2017-06-16</DISDATE>
<ANALYSIS>
    <RESPONSE>
        <VARNAME>A1</VARNAME>
        <VALUE>4</VALUE>
    </RESPONSE>
    <RESPONSE>
        <VARNAME>A2</VARNAME>
        <VALUE>4</VALUE>
    </RESPONSE>
    <RESPONSE>
        <VARNAME>D1</VARNAME>
        <VALUE>4</VALUE>
    </RESPONSE>
    <RESPONSE>
        <VARNAME>D2</VARNAME>
        <VALUE>5</VALUE>
    </RESPONSE>
    <RESPONSE>
        <VARNAME>D3</VARNAME>
        <VALUE>4</VALUE>
    </RESPONSE>
    <RESPONSE>
        <VARNAME>I1</VARNAME>
        <VALUE>5</VALUE>
</ANALYSIS>
</PATIENTLEVELDATA>
</DATA_EXPORT>
  • Please provide a reduced (!) and anonymised (!) XML to reproduce your issue. One first hint might be to use `text()[1]` instead of the `.` in your value methods. – Shnugo Jul 31 '18 at 16:42
  • Why a `.value()` with `.` will return all data *below* this node is described [here](https://stackoverflow.com/a/43242238/5089204). Copy the example there into a new query window and analyse the output. – Shnugo Jul 31 '18 at 16:44
  • Hi Shungo, I have added a sample of XML to the post. I think I understand that CROSS APPLY appends the entire set of values as a single row to the table when there are multiple children. Is there any way to split it up, such as with a PIVOT, SELECT or DISTINCT? Or do I need to write some kind of function? Oh and replacing the value parameter with text()[1] resulted in NULL entries to the column. – Joshua Woleben Jul 31 '18 at 19:14

2 Answers2

1

Do you need something like this?

DECLARE @xml XML=
'<DATA_EXPORT>
  <PATIENTLEVELDATA>
    <SURVEY_ID>00000001</SURVEY_ID>
    <CLIENT_ID>0000002</CLIENT_ID>
    <SERVICE>IN</SERVICE>
    <RECDATE>2018-01-12</RECDATE>
    <DISDATE>2017-06-16</DISDATE>
    <ANALYSIS>
      <RESPONSE>
        <VARNAME>A1</VARNAME>
        <VALUE>4</VALUE>
      </RESPONSE>
      <RESPONSE>
        <VARNAME>A2</VARNAME>
        <VALUE>4</VALUE>
      </RESPONSE>
      <RESPONSE>
        <VARNAME>D1</VARNAME>
        <VALUE>4</VALUE>
      </RESPONSE>
      <RESPONSE>
        <VARNAME>D2</VARNAME>
        <VALUE>5</VALUE>
      </RESPONSE>
      <RESPONSE>
        <VARNAME>D3</VARNAME>
        <VALUE>4</VALUE>
      </RESPONSE>
      <RESPONSE>
        <VARNAME>I1</VARNAME>
        <VALUE>5</VALUE>
      </RESPONSE>
    </ANALYSIS>
  </PATIENTLEVELDATA>
</DATA_EXPORT>';

SELECT @xml.value('(/DATA_EXPORT/PATIENTLEVELDATA/SURVEY_ID/text())[1]','nvarchar(100)') AS Survey_Id
      ,@xml.value('(/DATA_EXPORT/PATIENTLEVELDATA/CLIENT_ID/text())[1]','nvarchar(100)') AS Client_Id
      ,@xml.value('(/DATA_EXPORT/PATIENTLEVELDATA/SERVICE/text())[1]','nvarchar(100)') AS [Service]
      ,@xml.value('(/DATA_EXPORT/PATIENTLEVELDATA/RECDATE/text())[1]','date') AS RecDate
      ,@xml.value('(/DATA_EXPORT/PATIENTLEVELDATA/DISDATE/text())[1]','date') AS DisDate
      ,@xml.value('(/DATA_EXPORT/PATIENTLEVELDATA/ANALYSIS/RESPONSE[(VARNAME/text())[1]="A1"]/VALUE/text())[1]','int') AS Response_A1
      ,@xml.value('(/DATA_EXPORT/PATIENTLEVELDATA/ANALYSIS/RESPONSE[(VARNAME/text())[1]="A2"]/VALUE/text())[1]','int') AS Response_A2
      ,@xml.value('(/DATA_EXPORT/PATIENTLEVELDATA/ANALYSIS/RESPONSE[(VARNAME/text())[1]="D1"]/VALUE/text())[1]','int') AS Response_D1
      --more of them

this would be a more tabular query:

SELECT ld.value('(SURVEY_ID/text())[1]','nvarchar(100)') AS Survey_Id
      ,ld.value('(CLIENT_ID/text())[1]','nvarchar(100)') AS Client_Id
      ,ld.value('(SERVICE/text())[1]','nvarchar(100)') AS [Service]
      ,ld.value('(RECDATE/text())[1]','date') AS RecDate
      ,ld.value('(DISDATE/text())[1]','date') AS DisDate
      ,r.value('(VARNAME/text())[1]','nvarchar(100)') VarName
      ,r.value('(VALUE/text())[1]','int') VarValue
FROM @xml.nodes('/DATA_EXPORT/PATIENTLEVELDATA') A(ld)
CROSS APPLY ld.nodes('ANALYSIS/RESPONSE') B(r);
Shnugo
  • 66,100
  • 9
  • 53
  • 114
  • I think the number of rows is indeterminate and can vary, so I don't know if I can use a query like the one above. Unless I am misreading it? Edit: I tried your second query and it worked. Thank you so much! – Joshua Woleben Jul 31 '18 at 19:58
0

Since I was too slow with my answer I still post it to show the OP he was not that far away from the desired result:

select
Data.xData.query('SURVEY_ID').value('.','int'),
Data.xData.query('CLIENT_ID').value('.','int'),
Data.xData.query('SERVICE').value('.','varchar(6)'),
Data.xData.query('RECDATE').value('.','date'),
Data.xData.query('DISDATE').value('.','date'),
Data2.xData2.query('VARNAME').value('.', 'varchar(200)'),
Data2.xData2.query('VALUE').value('.','int')
from @fileData.nodes('/DATA_EXPORT/PATIENTLEVELDATA') as Data(xData)
CROSS APPLY @fileData.nodes('/DATA_EXPORT/PATIENTLEVELDATA/ANALYSIS/RESPONSE') as Data2(xData2)
Mirco Ellmann
  • 983
  • 7
  • 8
  • Do you really want to use the full path `/DATA_EXPORT/PATIENTLEVELDATA/` in `APPLY`? And there is no reason to use `query()` and `.value()` in this combination... – Shnugo Jul 31 '18 at 20:11
  • I just fixed the original code to get the desired result. – Mirco Ellmann Jul 31 '18 at 20:13