0

I'd like to change this XML to SQL statement below to work with a path rather than a file name, at the moment it's looking for 'C:\Test\XML\PT38.xml'.

I need it to parse any .XML file in the folder rather than look for a specific one. It will be just one file at a time but they'll have different names (number increase: PT39, PT40, etc.).

I tried adding a variable for the path then changing the BULK to look for the variable, but it failed as expected.

I've read something about creating a temporary table then parse the date, but I'm not sure that would work for me.

I'd appreciate the help.

This is what I tried:

DECLARE @xmlFileName varchar(100) = 'C:\Test\XML\'
FROM OPENROWSET(BULK ''' + @xmlFileName + ''', SINGLE_BLOB) AS T(MY_XML)) AS T(MY_XML)

This is the XML content:

<?xml version="1.0" encoding="UTF-8" ?>
<MOD1>
   <DC BEGIN="1">
      <DC4 SEGMENT="1">
         <TABNAM>DC4</TABNAM>
         <DOCNUM>0000003899888135</DOCNUM>
      </DC4>
      <ZPR SEGMENT="1">
         <AUFNR>000915229446</AUFNR>
         <LNO>RM01PL01</LNO>
         <CHARG>0006186588</CHARG>
         <STR2>211609</STR2>
         <QTY>4166.000</QTY>
         <PLN_ORDER>6963701111</PLN_ORDER>
      </ZPR>
   </DC>
</MOD1>

This is the SQL table:

CREATE TABLE XMLTESTTABLE
(
    PON int, 
    ASP int, 
    LTN varchar(11),   
    GAS int, 
    QY varchar(15), 
    LNO varchar(2), 
    StartTime date,
);

This is the statement:

INSERT INTO XMLTESTTABLE(PON, ASP, LTN, GAS, QY, LNO, StartTime)
SELECT ZPRM.value('(AUFNR/text())[1]', 'int')
    , ZPRM.value('(CHARG/text())[1]', 'int')
    , ZPRM.value('(PLN_ORDER/text())[1]', 'VARCHAR(10)')
    , ZPRM.value('(CHARG/text())[1]', 'int')
    , ZPRM.value('(QTY/text())[1]', 'DECIMAL(10,0)') AS [qty]
    , RIGHT(ZPRM.value('(LNO/text())[1]', 'VARCHAR(10)'), 2) AS [LNO]
    , TRY_CAST(STUFF(STUFF(ZPRM.value('(STR2/text())[1]', 'CHAR(6)'),3,0,':'),6,0,':') AS TIME)
FROM (SELECT TRY_CAST(MY_XML AS xml)
      FROM OPENROWSET(BULK 'C:\Test\XML\PT38.xml', SINGLE_BLOB) AS T(MY_XML)) AS T(MY_XML)
      CROSS APPLY MY_XML.nodes('/MOD1/DC/ZPR') AS MY_XML(ZPRM);
Yitzhak Khabinsky
  • 18,471
  • 2
  • 15
  • 21
  • What error did you get? – Aaron Reese Mar 02 '22 at 14:34
  • Cannot bulk load. The file "'@xmlFileName'" does not exist. – KnockOutMan2015 Mar 02 '22 at 15:02
  • 1
    Openrowset requires a string literal - you have to use dynamic sql (which you tagged). I suggest you try to implement this in pieces since the entire query will be difficult to write in one attempt with all the embedded quotation marks. A bit odd that you store a time in a datetime column. [A starting point to using a dynamic file name](https://stackoverflow.com/questions/39827982/how-to-use-a-variable-in-openrowset-command) – SMor Mar 02 '22 at 15:03
  • Looks like `nodes('/MOD1/DC/ZPRM')` should be `nodes('/MOD1/DC/ZPR')`. What exactly are you asking? There are two steps you need to do here: get your data into SQL Server from files, and parse the XML into tables. The latter you seem to have done already. The former can only be done with `OPENROWSET` if you use dynamic SQL. You could do that, but I would strongly recommend you use Powershell, C# or Python to read the files instead. SQL Server is not good at that kind of thing – Charlieface Mar 02 '22 at 15:04
  • Or SSIS if you want to stay in the SQL stack. – Aaron Reese Mar 02 '22 at 15:04
  • @Charlieface , that's correct, it's a typo, sorry. I just want to get some of the data from the XML file and send it to the table, I don't need the whole XML content, sorry If I wasn't clear. – KnockOutMan2015 Mar 02 '22 at 15:13
  • Whatever. My point is that `OPENROWSET` with dynamic SQL is the only way to do this in pure T-SQL, which I think is the wrong tool for this job. – Charlieface Mar 02 '22 at 15:16
  • What is your SQL Server version? – Yitzhak Khabinsky Mar 02 '22 at 15:26
  • @YitzhakKhabinsky , testing on Express version 11.0.7462.6, Live is SQL 2012 11.0.7493.4 – KnockOutMan2015 Mar 02 '22 at 15:39

1 Answers1

0

It is much easier to implement in SQL Server 2017 and later. It has much better API to deal with the file system.

Please try the following solution. It will work in SQL Server 2012.

I modified the StartTime column data type as TIME(0).

You would need to modify @folder variable value to match what you have in your environment.

SQL

USE tempdb;
GO

DROP TABLE IF EXISTS dbo.XMLTESTTABLE;

CREATE TABLE dbo.XMLTESTTABLE
(
    PON varchar(10), 
    ASP int, 
    LTN varchar(11),   
    GAS int, 
    QY varchar(15), 
    LNO varchar(2), 
    StartTime TIME(0)
);

DECLARE @xml XML
   , @sql NVARCHAR(MAX)
   , @XMLfileName VARCHAR(256) -- 'e:\Temp\TradeFeed\PT38.xml';
   , @folder VARCHAR(256) = 'e:\Temp\TradeFeed';

DECLARE @tbl TABLE (
    id INT IDENTITY(1,1) PRIMARY KEY,
    [fileName] VARCHAR(512),
    depth INT,
    isfile BIT
);

INSERT INTO @tbl ([fileName], depth, isfile)
EXEC master.sys.xp_dirtree @folder,1,1;

-- just to see
SELECT * FROM @tbl;

-- filter out not need files
SELECT TOP(1) @XMLfileName = CONCAT(@folder, '\', [fileName])
FROM @tbl
WHERE isfile = 1
    AND [fileName] LIKE '%.xml';


SET @sql = N'SELECT @xmlOut = XmlDoc FROM OPENROWSET (BULK ' + QUOTENAME(@XMLfileName,NCHAR(39)) + ', SINGLE_BLOB) AS Tab(XmlDoc)';

EXEC master.sys.sp_executesql @sql, N'@xmlOut XML OUTPUT', @xmlOut = @xml OUTPUT;

INSERT INTO XMLTESTTABLE(PON, ASP, LTN, GAS, QY, LNO, StartTime)
SELECT @xml.value('(/MOD1/DC/DC4/TABNAM/text())[1]', 'VARCHAR(10)')
    , c.value('(CHARG/text())[1]', 'int')
    , c.value('(PLN_ORDER/text())[1]', 'VARCHAR(10)')
    , c.value('(CHARG/text())[1]', 'int')
    , c.value('(QTY/text())[1]', 'DECIMAL(10,0)') AS [qty]
    , RIGHT(c.value('(LNO/text())[1]', 'VARCHAR(10)'), 2) AS [LNO]
    , TRY_CAST(STUFF(STUFF(c.value('(STR2/text())[1]', 'CHAR(6)'),3,0,':'),6,0,':') AS TIME(0))
FROM @xml.nodes('/MOD1/DC/ZPR') AS t(c);

-- test
SELECT * FROM dbo.XMLTESTTABLE;

Output

+-----------+---------+------------+---------+------+-----+-----------+
|    PON    |   ASP   |    LTN     |   GAS   |  QY  | LNO | StartTime |
+-----------+---------+------------+---------+------+-----+-----------+
|    DC4    | 6186588 | 6963701111 | 6186588 | 4166 |  01 | 21:16:09  |
+-----------+---------+------------+---------+------+-----+-----------+
Yitzhak Khabinsky
  • 18,471
  • 2
  • 15
  • 21
  • `xp_dirtree` is undocumented, and I'd advise not relying on it. Any case, this is so much easier to do in Powershell. You could do it in a single line `Get-Content "*.xml" | Invoke-SqlCmd ...` and pass in the data as a parameter. – Charlieface Mar 03 '22 at 00:46
  • As I already mentioned, SQL Server 2017 and later has much better API to deal with the file system. I don't understand why people are using out of support SQL Server versions. – Yitzhak Khabinsky Mar 03 '22 at 00:52
  • Even the new ones (which are also only partially documented) should ideally not be used. It's just silly to expect SQL Server (which runs under a separate service account) to be able to manipulate files. That is the job of a generalized scripting language, which T-SQL is not. – Charlieface Mar 03 '22 at 01:29
  • This worked great for me, thanks for that! And thank you for the heads up, I have other things running on PowerShell that is related to this task so I'll try to get more done through it. – KnockOutMan2015 Mar 03 '22 at 12:04
  • @YitzhakKhabinsky If you don't mind me asking, I want to get data from /MOD1/DC/DC4 TABNUM and put it on PON (instead of from /MOD1/DC/ZPR AUFNR) where do I need to change? I tried adding a new INSERT INTO and FROM below but didn't get on well, ignored the whole of the first statement. I guess there's a much easier way. – KnockOutMan2015 Mar 03 '22 at 12:24
  • @KnockOutMan2015, I updated the answer by making two modifications: (1) modified PON column data type, (2) modified XQuery expression to get `DC4` value. – Yitzhak Khabinsky Mar 03 '22 at 13:39
  • @KnockOutMan2015, please connect with me on LinkedIn – Yitzhak Khabinsky Mar 03 '22 at 13:40