2

I have a lot of xmls with pdf data that is them and encoded with B64 encoding.

<Document>
    <component>
        <nonXMLBody>
          <text mediaType="application/pdf" representation="B64">JVBERi0xLjMNJf////
          </text>
        </nonXMLBody>
      </component>
</Document>

I have two problems: 1) figuring out the right syntax to get all of the data out. I keep getting truncated versions. I've tried varchar(max) and varbinary.

SELECT x.value('(component/nonXMLBody/text/text())[1]','varchar(max)') as 
FROM @XML.nodes('/Document') as Addr (x))

2) How to decode the B64 data.

I found a post that seems like it is close to what I need but I'm still stuck. Base64 encoding in SQL Server 2005 T-SQL

Thanks in advance for any help

Community
  • 1
  • 1
  • 1) Take care of blanks and proper concatenating of rows of B64 text. Try using `fn:string()` instead of `text()` - see http://blog.davidcassel.net/2011/06/text-fnstring-and-fndata/ – CiaPan Sep 23 '14 at 12:49

1 Answers1

0

As far as the XQuery part is concerned, you can (should?) cast the Base64 inside the document to an xs:base64Binary with

xs:base64Binary((component/nonXMLBody/text)[1])

assuming the document is bound with the context item as you seem to assume. Without the cast, the contents of the tag will be untyped, or a string at best.

Once you have this xs:base64Binary atomic item, I am not familiar with the binding to SQL Server types and looking at its documentation, it seems that varbinary(max) could do the trick? Something like:

SELECT x.value('xs:base64Binary((component/nonXMLBody/text)[1])','varbinary(max)') as 
FROM @XML.nodes('/Document') as Addr (x))
Ghislain Fourny
  • 6,971
  • 1
  • 30
  • 37