1

I'm pretty new to using XML with TSQL, and recently ran into an issue that I can't think around and need your help with.

Scenario: I have a query built that returns quite a few columns, one of which contains a UNC path to an .xml file. For simplicity's sake, lets just say there's 2 columns: GUID, filePath

Example values:
GUID | filePath
0001 | \\server\folder\file1.xml
0002 | \\server\folder\file2.xml
0003 | \\server\folder\file3.xml

Goal: There are three field values in each xml file that I want returned (shown below) but there are additional fields over than just the three I want.

xml:

<form>
  <field>
    <name>TextboxAllocation3</name>
    <value>0</value>
  </field>
  <field>
    <name>TextboxAllocation1</name>
    <value>0</value>
  </field>
  <field>
    <name>TextboxAllocation2</name>
    <value>0</value>
  </field>
...
</form>

Question: How could I craft a query that would return:

GUID, TextboxAllocation1, TextboxAllocation2, TextboxAllocation3, when every GUID has a different filepath?

What I've tried:
• Using openrowset, but specifying the target can't be a variable (or in this case, it can't be the filePath from the query), it must be text, which lead me down the path of dynamic SQL which quickly turned into an amalgamation of spaghetti that I'm realizing I can't think through right now.

Matt H
  • 640
  • 4
  • 17
  • Using to find `t sql open file` turned up [this](https://www.simple-talk.com/sql/t-sql-programming/reading-and-writing-files-in-sql-server-using-t-sql/) and [this](https://stackoverflow.com/questions/12502213/how-to-read-a-text-file-using-t-sql). Perhaps a little more hunting for a means to read a file in general will get you to a happy place. (Kudos for a high quality first question!) – HABO Jun 10 '17 at 03:11
  • @HABO Thank you, I definitely did my google searching prior to posting! That's how I started down the `openrowset` path. :] Thanks for the compliment, doesn't feel like my first post, since I'm on the site constantly. – Matt H Jun 12 '17 at 15:13

2 Answers2

1

Your question is two folded:

  • read the files with dynamically set file paths
  • Find a query to read from the XMLs

Part 1: Read the files

Try this:

DECLARE @mockup TABLE([GUID] VARCHAR(100),filePath VARCHAR(100));
INSERT INTO @mockup VALUES
 ('0001','\\YourPath\File1.xml')
,('0002','\\YourPath\File2.xml')
,('0003','\\YourPath\File3.xml');

--Use a physically created table as staging table

CREATE TABLE StagingFileContent([GUID] VARCHAR(100),FileContent VARBINARY(MAX));

--A cursor-loop will read the file names and call OPENROWSET with dynamic SQL

DECLARE @g VARCHAR(100),@fp VARCHAR(100);
DECLARE @cmd VARCHAR(MAX);

DECLARE cur CURSOR FOR SELECT [GUID],filePath FROM @mockup;
OPEN cur;
FETCH NEXT FROM cur INTO @g,@fp;
WHILE @@FETCH_STATUS=0
BEGIN
    SET @cmd='INSERT INTO StagingFileContent([GUID],FileContent) ' +
             'SELECT ''' + @g + ''',* FROM OPENROWSET(BULK ''' +  @fp + ''', SINGLE_BLOB) AS Contents;'
    EXEC(@cmd);
FETCH NEXT FROM cur INTO @g,@fp;
END 
CLOSE cur;
DEALLOCATE cur;

--Now the staging table holds the GUID and the content as VARBINARY(MAX)

SELECT * FROM StagingFileContent;

Hint:

This might be depending on the encoding of your files. You can try SINGLE_CLOB and using VARCHAR(MAX) or NVARCHAR(MAX) instead of VARBINARY(MAX)

Part 2: Read the XMLs

Try this, the table is simulating your staging table:

DECLARE @xmls TABLE([GUID] VARCHAR(100),FileContent VARBINARY(MAX));
INSERT INTO @xmls VALUES
('0001',CAST(N'<form>
              <field>
                <name>TextboxAllocation3</name>
                <value>0</value>
              </field>
              <field>
                <name>TextboxAllocation1</name>
                <value>0</value>
              </field>
              <field>
                <name>TextboxAllocation2</name>
                <value>0</value>
              </field>
            </form>' AS VARBINARY(MAX)))
,('0002',CAST(N'<form>
              <field>
                <name>SomeMore</name>
                <value>1</value>
              </field>
              <field>
                <name>EvenMore</name>
                <value>2</value>
              </field>
            </form>' AS VARBINARY(MAX)));
WITH Casted AS
(
    SELECT [GUID],CAST(CAST(FileContent AS NVARCHAR(MAX)) AS XML) AS YourXML
    FROM @xmls
)
SELECT [GUID]
      ,f.value(N'(name/text())[1]','nvarchar(max)') AS FieldName
      ,f.value(N'(value/text())[1]','nvarchar(max)') AS FieldValue
FROM Casted
CROSS APPLY YourXML.nodes(N'/form/field') AS A(f);

The result:

GUID    FieldName          FieldValue
0001    TextboxAllocation3  0
0001    TextboxAllocation1  0
0001    TextboxAllocation2  0
0002    SomeMore            1
0002    EvenMore            2
Shnugo
  • 66,100
  • 9
  • 53
  • 114
  • Thank you for your help. I'm going to try this out this morning in-between meetings. With this method, I'll have to read the same information twice (once to know which XML files to pull, once to bind it to the rest of the data), but it's by far the most thought out way of doing it. +1 for now, will mark as answer after I verify it all works out! – Matt H Jun 12 '17 at 15:01
  • Thanks again! Using the method to store the information in a staging table from the dynamic SQL and cursor was a good idea. I used a global temp table instead, so I could deallocate it when necessary without messing with the database schema/structure (I don't want to void warranties with adding tables, views, sp's, etc.). This change is now live, and I really appreciate your help getting it there! – Matt H Jun 12 '17 at 16:34
0

You can query the xml as below: Not sure what you are trying to do with GUID.. If you have that column in different table you can do cross apply to below query and get xml's output:

declare @x xml = '<form>
  <field>
    <name>TextboxAllocation3</name>
    <value>0</value>
  </field>
  <field>
    <name>TextboxAllocation1</name>
    <value>0</value>
  </field>
  <field>
    <name>TextboxAllocation2</name>
    <value>0</value>
  </field>
</form>'

Select  T.c.value('name[1]', 'varchar(20)') as [Name],
        T.c.value('value[1]','int') as [Value]
from @x.nodes('form/field') T(c)
Kannan Kandasamy
  • 13,405
  • 3
  • 25
  • 38
  • Thank you, Kannan. I neglected to mention that I do know the syntax to grab the information from xml. My struggle is pulling the xml data from a potentially many files on the drive, selecting the values, and returning the information all in one query/result set. The GUID is there to indicate unique rows, and when I explained the question, it was mentioned that each GUID would have its own .xml file on the drive. – Matt H Jun 12 '17 at 15:03