59

I'm trying to insert rows into a table using a select from XML. I think I'm close. Where am I going wrong?

declare @xmldata xml;
set @xmldata = '<Database>
                  <PurchaseDetails>
                    <PurchaseDetail>
                      <Upc>72594206916</Upc>
                      <Quantity>77</Quantity>
                      <PurchaseDate>9/2010</PurchaseDate>
                      <PurchaseCity>Dallas</PurchaseCity>
                      <PurchaseState>TX</PurchaseState>
                    </PurchaseDetail>
                    <PurchaseDetail>
                      <Upc>72594221854</Upc>
                      <Quantity>33</Quantity>
                      <PurchaseDate>12/2013</PurchaseDate>
                      <PurchaseCity>Nashville</PurchaseCity>
                      <PurchaseState>TN</PurchaseState>
                    </PurchaseDetail>
                  </PurchaseDetails>
                </Database>'

insert into PurchaseDetails
(Upc, Quantity, PurchaseDate, PurchaseCity, PurchaseState)
select
    x.Rec.value('Upc','char(11)'),
    x.Rec.value('Quantity','int'),
    x.Rec.value('PurchaseDate','varchar(7)'),
    x.Rec.value('PurchaseCity','varchar(50)'),
    x.Rec.value('PurchaseState','char(2)')
from @xmlData.nodes('//Database/PurchaseDetails/PurchaseDetail') as x(Rec)
birdus
  • 7,062
  • 17
  • 59
  • 89

6 Answers6

57

A co-worker had tackled a similar problem before. Here is what we came up with. NOT intuitive!

insert into PurchaseDetails
(Upc, Quantity, PurchaseDate, PurchaseCity, PurchaseState)
select
    pd.value('Upc[1]','char(11)'),
    pd.value('Quantity[1]','int'),
    pd.value('PurchaseDate[1]','varchar(7)'),
    pd.value('PurchaseCity[1]','varchar(50)'),
    pd.value('PurchaseState[1]','char(2)')
from @xmlData.nodes('//Database/PurchaseDetails') as x(Rec)
cross apply @xmlData.nodes('//Database/PurchaseDetails/PurchaseDetail') as i(pd)
birdus
  • 7,062
  • 17
  • 59
  • 89
  • 10
    Works perfectly; any idea what the reason behind the [1] is? – Jerod Venema Aug 05 '14 at 12:53
  • 7
    @jvenema, The reason behind the problem is that XQuery always returns a collection if you don't use [] at the end of it. So in your case pd.Value('sth') returns a collection, although nodes function in the From clause deliver only one row to the Value function By the way The Mikel's answer is also correct and more to the point. – Jami May 24 '15 at 14:48
  • 1
    Any idea what the cross apply is doing? x(Rec) isn't referenced in the query, but taking that piece out makes it not work. – Anov Sep 22 '15 at 17:35
  • This was surprisingly hard to find, works great..I don't understand how it works for unknown amount of records, is it recursive? – Dave Kelly Mar 03 '16 at 02:26
40

Try this!
query() then value()
run this in SQL Server and 100% worked
put a dot (.) first then the child tag.
PurchaseDetail tag exists 2 times so the dot (.) replaces the first and the second tag.
The dot can prevent using of [1] on XQuery.
The dot represents the first and the second PurchaseDetail tags.

INSERT INTO PurchaseDetails(Upc, Quantity, PurchaseDate, PurchaseCity, PurchaseState)
SELECT col.query('./Upc').value('.', 'char(11)'),
    col.query('./Quantity').value('.', 'int'),
    col.query('./PurchaseDate').value('.', 'varchar(7)'),
    col.query('./PurchaseCity').value('.', 'varchar(50)'),
    col.query('./PurchaseState').value('.', 'char(2)')
FROM @xmlData.nodes('/Database/PurchaseDetails/PurchaseDetail') as ref(col)

It is more simplified query so far.
See if it works

marion-jeff
  • 759
  • 7
  • 13
25

insert into PurchaseDetails(Upc, Quantity, PurchaseDate, PurchaseCity, PurchaseState)
select T.X.value('(Upc/text())[1]', 'char(11)'),
       T.X.value('(Quantity/text())[1]', 'int'),
       T.X.value('(PurchaseDate/text())[1]', 'varchar(7)'),
       T.X.value('(PurchaseCity/text())[1]', 'varchar(50)'),
       T.X.value('(PurchaseState/text())[1]', 'char(2)')
from @xmlData.nodes('/Database/PurchaseDetails/PurchaseDetail') as T(X)
Mikael Eriksson
  • 136,425
  • 22
  • 210
  • 281
  • This is the most elegant and efficient solution that should be the accepted answer. – max Jan 30 '20 at 22:06
14
select
    x.Rec.query('./Upc').value('.','char(11)')
    ,x.Rec.query('./Quantity').value('.','int')
    ,x.Rec.query('./PurchaseDate').value('.','varchar(7)')
    ,x.Rec.query('./PurchaseCity').value('.','varchar(50)')
    ,x.Rec.query('./PurchaseState').value('.','char(2)')
from @xmlData.nodes('/Database/PurchaseDetails/PurchaseDetail') as x(Rec)
Rand.Function
  • 770
  • 1
  • 7
  • 10
10

Struggling with a similar problem, and found that @birdus's answer didn't work if you have additional layers of nesting in your xml that you were referencing in your XQuery, e.g. supposing a slightly different XML shape, if you had

T.x.value('PurchasePlace/PurchaseCity[1]','varchar(50)')

you would still get the singleton error. Though @birdus's solution does work for this specific case a more generally applicable solution that combines the best of @birdus's & @Mikael-Eriksson's solution is to do:

insert into PurchaseDetails(Upc, Quantity, PurchaseDate, PurchaseCity, PurchaseState)
select T.X.value('(Upc)[1]', 'char(11)'),
T.X.value('(Quantity)[1]', 'int'),
T.X.value('(PurchaseDate)[1]', 'varchar(7)'),
T.X.value('(PurchaseCity)[1]', 'varchar(50)'),
T.X.value('(PurchaseState)[1]', 'char(2)')
from @xmlData.nodes('/Database/PurchaseDetails/PurchaseDetail') as T(X)

This combine's @birdus's omission of /text(), which is superfluous, but adds @Mikael-Eriksson's parentheses around the element selector, to allow multiple element selectors as in my modified example which becomes:

T.x.value('(PurchasePlace/PurchaseCity)[1]','varchar(50)')

The reason for this, that a few have asked about, is not that @birdus's version returns something other than a singleton in any of the examples discussed here, but that it might. Per Microsoft Docs:

Location steps, function parameters, and operators that require singletons will return an error if the compiler cannot determine whether a singleton is guaranteed at run time.

pbz
  • 699
  • 6
  • 6
  • 1
    I'm having a similar issue but for what it's worth, I removed the `/text()`, and it caused a major lag in time in my data set. With `/text()`, it took 25 seconds to insert 55K records into the table. Without `/text()`, I stopped the query at 53 seconds because the additional time wasn't worth it. – Dan Oct 18 '18 at 13:13
  • Wrapping the xpath in parenthesis worked for me, followed by the first instance attribute: XmlPayLoad.value( '(/root/child/wantedvalue)[1]','varchar' ) – DubMan Nov 13 '18 at 15:52
5

To address the question of why the need for the positional predicate (i.e., [1]) within the XQuery string literal, like @pbz indicated, a singleton is required and therefore must be guaranteed. To add more substance to @pbz's answer, see below.

Per Microsoft's SQL Docs:

In the following example, an XML instance is stored in a variable of xml type. The value() method retrieves the ProductID attribute value from the XML. The value is then assigned to an int variable.

DECLARE @myDoc xml  
DECLARE @ProdID int  
SET @myDoc = '<Root>  
<ProductDescription ProductID="1" ProductName="Road Bike">  
<Features>  
  <Warranty>1 year parts and labor</Warranty>  
  <Maintenance>3 year parts and labor extended maintenance is available</Maintenance>  
</Features>  
</ProductDescription>  
</Root>'  

SET @ProdID =  @myDoc.value('(/Root/ProductDescription/@ProductID)[1]', 'int' )  
SELECT @ProdID  

Value 1 is returned as a result.

Although there is only one ProductID attribute in the XML instance, the static typing rules require you to explicitly specify that the path expression returns a singleton. Therefore, the additional [1] is specified at the end of the path expression. For more information about static typing, see XQuery and Static Typing.

Following that link then leads us to:

As mentioned earlier, type inference frequently infers a type that is broader than what the user knows about the type of the data that is being passed. In these cases, the user has to rewrite the query. Some typical cases include the following:

...

  • The type infers a higher cardinality than what the data actually contains. This occurs frequently, because the xml data type can
    contain more than one top-level element
    , and an XML schema collection cannot constrain this. In order to reduce the static type and
    guarantee that there is indeed at most one value being passed, you
    should use the positional predicate [1]
    . For example, to add 1 to the value of the attribute c of the element b under the top-level a
    element, you must write (/a/b/@c)[1]+1. Additionally, the DOCUMENT
    keyword can be used together with an XML schema collection.
WutDuk
  • 71
  • 1
  • 7