5

I have a xml which I want it to be extracted using OpenXML within SQL Server

Here is the sample XML

<row>
  <student_token>7</student_token>
  <student_ssn>552</student_ssn>
  <alternate_id>20</alternate_id>
  <old_ssn xsi:nil="true" />
  <alien_num xsi:nil="true" />
  <last_name>A</last_name>
  <first_name>B</first_name>
  <middle_init xsi:nil="true" />
  <drivers_license_num xsi:nil="true" />
  <gpa_highschool xsi:nil="true" />
  <created_dt>2006-07-13T11:15:08.320</created_dt>
  <created_how>4</created_how>
  <modified_dt>2008-02-14T00:00:00</modified_dt>
  <modified_by>4</modified_by>
  <primary_street2 xsi:nil="true" />
  <primary_street3 xsi:nil="true" />
  <primary_country xsi:nil="true" />
  <email_address xsi:nil="true" />
  <address_start_dt xsi:nil="true" />
  <address_end_dt xsi:nil="true" />
  <entrance_iv_dt xsi:nil="true" />
  <entrance_iv_by xsi:nil="true" />
  <exit_iv_dt>2006-11-02T00:00:00</exit_iv_dt>
  <exit_iv_by>156</exit_iv_by>
  <foreign_address_indicator>N</foreign_address_indicator>
  <foreign_postal_code xsi:nil="true" />
  <pin>J27841</pin>
  <web_id>J08614   </web_id>
  <prior_name xsi:nil="true" />
  <orig_eps xsi:nil="true" />
  <web_role>STU1</web_role>
  <heal_limit_flag>N</heal_limit_flag>
  <email_address_2>test@test.com</email_address_2>
  <cellular_telephone>415</cellular_telephone>
  <alt_loan_debt xsi:nil="true" />
  <web_last_login xsi:nil="true" />
  <foreign_country_code xsi:nil="true" />
  <entrance_iv_dt_grad_plus xsi:nil="true" />
  <entrance_iv_by_grad_plus xsi:nil="true" />
  <failed_logins>0</failed_logins>
  <hispanic xsi:nil="true" />
  <race xsi:nil="true" />
  <primary_phone_number_intl xsi:nil="true" />
  <security_version>0</security_version>
  <failed_challenge_response>0</failed_challenge_response>
  <require_pin_reset xsi:nil="true" />
</row>

The query should extract into 3 fields for each row

  1. FieldName
  2. FieldValue
  3. IsNull

For example the first row should be

  • FieldName = student_token - The node name would be the field name
  • FieldValue = 7
  • IsNull = false - IsNull is based on the attribute xsi:nil="true"

How can I do this?

Joundill
  • 6,828
  • 12
  • 36
  • 50
Rush
  • 97
  • 1
  • 1
  • 7
  • The tag name is the FieldName and ISNull is based on the Attribute xsi:nil="true" – Rush Mar 06 '12 at 14:26
  • The result should take the element name from each node and return that as the FieldName. For example the first row fieldname would be student_token and the next one would be student_ssn and the next would be alternate_id .... – Rush Mar 06 '12 at 14:30
  • You mean use the first element name and use that as the FieldName? There are multiple xsi:nil. – paparazzo Mar 06 '12 at 14:41

3 Answers3

11

Sample data with namespace added.

declare @xml xml 
set @xml = 
'<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <student_token>7</student_token>
  <student_ssn>552</student_ssn>
  <alternate_id>20</alternate_id>
  <old_ssn xsi:nil="true" />
  <alien_num xsi:nil="true" />
  <last_name>A</last_name>
  <first_name>B</first_name>
  <middle_init xsi:nil="true" />
  <drivers_license_num xsi:nil="true" />
  <gpa_highschool xsi:nil="true" />
  <created_dt>2006-07-13T11:15:08.320</created_dt>
  <created_how>4</created_how>
  <modified_dt>2008-02-14T00:00:00</modified_dt>
  <modified_by>4</modified_by>
  <primary_street2 xsi:nil="true" />
  <primary_street3 xsi:nil="true" />
  <primary_country xsi:nil="true" />
  <email_address xsi:nil="true" />
  <address_start_dt xsi:nil="true" />
  <address_end_dt xsi:nil="true" />
  <entrance_iv_dt xsi:nil="true" />
  <entrance_iv_by xsi:nil="true" />
  <exit_iv_dt>2006-11-02T00:00:00</exit_iv_dt>
  <exit_iv_by>156</exit_iv_by>
  <foreign_address_indicator>N</foreign_address_indicator>
  <foreign_postal_code xsi:nil="true" />
  <pin>J27841</pin>
  <web_id>J08614   </web_id>
  <prior_name xsi:nil="true" />
  <orig_eps xsi:nil="true" />
  <web_role>STU1</web_role>
  <heal_limit_flag>N</heal_limit_flag>
  <email_address_2>test@test.com</email_address_2>
  <cellular_telephone>415</cellular_telephone>
  <alt_loan_debt xsi:nil="true" />
  <web_last_login xsi:nil="true" />
  <foreign_country_code xsi:nil="true" />
  <entrance_iv_dt_grad_plus xsi:nil="true" />
  <entrance_iv_by_grad_plus xsi:nil="true" />
  <failed_logins>0</failed_logins>
  <hispanic xsi:nil="true" />
  <race xsi:nil="true" />
  <primary_phone_number_intl xsi:nil="true" />
  <security_version>0</security_version>
  <failed_challenge_response>0</failed_challenge_response>
  <require_pin_reset xsi:nil="true" />
</row>'

Using openxml.

declare @idoc int
exec sp_xml_preparedocument @idoc out, @xml, '<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>'

select FieldName, 
       FieldValue,
       isnull([IsNull], 0)
from openxml(@idoc, '/row/*',1) 
  with (
         FieldName  varchar(50) '@mp:localname',
         FieldValue varchar(50) '.',
         [IsNull]   bit         '@xsi:nil'
       )

exec sp_xml_removedocument @idoc

Using the XML data type:

;with xmlnamespaces('http://www.w3.org/2001/XMLSchema-instance' as ns)
select T.N.value('local-name(.)', 'varchar(50)') as FieldName,
       T.N.value('.', 'varchar(50)') as FieldValue,
       isnull(T.N.value('@ns:nil', 'bit'), 0) as [IsNull]
from @xml.nodes('/row/*') as T(N)
Mikael Eriksson
  • 136,425
  • 22
  • 210
  • 281
  • @marc_s - Is the URL the same in the XML as in the `sp_xml_preparedocument` statement? – Mikael Eriksson Mar 06 '12 at 14:46
  • 1
    oops - no! :-) I had set it myself to just whatever - of course that doesn't match up with your setting! :-) Sorry, my bad..... works like a charm now! – marc_s Mar 06 '12 at 15:07
4

Not sure if you have that XML as a SQL variable or inside a table - question is very unclear .....

If you have it as an SQL variable, then try something like this (note: you must declare the xsi prefix somehow - otherwise SQL Server's XML processor won't even look at your XML document):

DECLARE @input XML = '<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <student_token>7</student_token>
  <student_ssn>552</student_ssn>
  <alternate_id>20</alternate_id>
  <old_ssn xsi:nil="true" />
  .........
</row>'

;WITH XMLNAMESPACES('http://www.w3.org/2001/XMLSchema-instance' as xsi)
SELECT
FieldName = T.C.value('local-name(.)', 'varchar(50)'),
FieldValue = T.C.value('(.)[1]', 'varchar(500)'),
IsNIL = ISNULL(T.C.value('(@xsi:nil)[1]', 'bit'), 0)
FROM
@Input.nodes('/row/*') AS T(C)

This gives me an output something like:

FieldName      FieldValue  IsNIL
student_token    7          0
student_ssn      552        0
alternate_id     20         0
old_ssn                     1
.....

Of course, all output is going to be of type varchar(500) now in the FieldValue column....

Updated my answer, based on Mikael Eriksson's answer, to include the IsNIL handling, too. Thanks Mikael for the inspiration! You deserve the nod and the accept vote!

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • The reason for using OpenXml was performance. Isn't this slower than openxml? – Rush Mar 06 '12 at 15:29
  • @Rush - I think that would be an interesting question on SO. It's been on my todo list for a while to check what is the fastest. I have read that they should be almost the same but ... needs some testing. – Mikael Eriksson Mar 06 '12 at 15:40
  • @Rush: no, not necessarily. Both approaches have their pros and cons - but the OpenXML is definitely more resource intensive (with the call to `sp_xml_preparedocument`) and less intuitive (at least to me) – marc_s Mar 06 '12 at 16:26
0

You can convert your xml file to json and then use OPENJSON instead. Take a look at openjson : SELECT star on how to use OPENJSON without having to list column names.

To convert an xml file to json you can use sp_execute_external_script and the xmltodict python module. Take a look at Importing Python Libraries to SQL Server to import the python module and deal with YAPI (Yet Another Python Install) issues.

Ludovic Aubert
  • 9,534
  • 4
  • 16
  • 28