2

Today I have a new situation, in which out client asks the following:

We have XML files in a folder. We need to load these files into a SQL Server table column (which has XML data type); we are NOT loading the XML outputs into various SQL Server tables, rather we are loading the XML file itself into a column, which has XML data type, in SQL Server database.

This ought to be done in SSIS only, as per the client requirement. So I use an Execute SQL Task in this manner. (Apart from the XML file, we need the ImportDate, FileName, etc.)

The Connection Type is OLE DB; the SQL Statement I have is as below:

INSERT INTO dbo.tablename (IncomingXMLfile, ImportDate)
-- I am using just 2 columns as an example for this table--
SELECT  CONVERT(XML, BulkColumn) AS BulkColumn, GETDATE() 
FROM OPENROWSET(BULK 'C:\Mic\...........\API_Schemas\ABC.xml', SINGLE_BLOB) AS x;

This works fine when I hardcode with just one file. Now, in the C:\Mic\...\API_Schemas folder, there are many XML files, I need to load each one of them, and apart from that get their ImportDate (which is GETDATE()), and the name of the file itself (I have not mentioned this column for now in the INSERT statement). I have to use a ForEachLoop Container, and place the Execute SQL Task inside this Container.

Hence there are 2 aspects to this issue:

  1. parameterizing the file name inside the Execute SQL Task.

  2. using ForEachLoop Container (and placing this Execute SQL Task inside) to run each XML file in the folder.

I use a user variable called Filename (string datatype), which has the value ABC.xml (the first file in the series of XML files in the folder); I parameterize the above TSQL query in this manner (see below):

INSERT INTO dbo.tablename (IncomingXMLfile, ImportDate)
SELECT  CONVERT(XML, BulkColumn) AS BulkColumn, GETDATE() 
FROM OPENROWSET(BULK 'C:\Mic\...........\API_Schemas\?', SINGLE_BLOB) AS x;

I map this parameter (?) to the user variable Filename in this Execute SQL Task (ParameterMapping tab). The query parsed correctly in the Execute SQL Task! However when I execute this SQL Task, I get this error:

"Parameter name is unrecognized.". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.

This situation is fairly different when I load the data from the XML files (using XML Source Editor, with the Data access mode (XML Data from variable) , etc.) into SQL Server tables. In our case, we are loading the XML file itself into a SQL Server table COLUMN which has XML data type, and get information about the various XML files loaded from the folder. As a result I am not sure on how to use a variable to run this package.

  • Can anyone first of all help me on how to parameterize the filename correctly?

  • And later on how to configure the ForEachLoop Container to read each file from the folder?

I am confused on how to use user variables in this scenario.

gofr1
  • 15,741
  • 11
  • 42
  • 52
user3812887
  • 439
  • 12
  • 33
  • The most straight way is to use dynamic SQL in a while loop. Get list of files from folder into temp table, take first, generate query run it go to second... etc. Or even generate one big statement and run it at once. – gofr1 Oct 06 '16 at 05:52
  • Can you provide me an example on how to use Dynamic SQL in such a scenario ? I have used Dynamic SQL in various circumstances, but not in a case like this. – user3812887 Oct 06 '16 at 06:57
  • I add solution with dynamic SQL as an answer. – gofr1 Oct 06 '16 at 07:18

1 Answers1

1

At first I create a table to store XML:

CREATE TABLE XMLstore (
    IncomingXMLfile xml,
    ImportDate datetime
)

Create 2 files for test purpose ABC.xml and DEF.xml and put in them some XML content:

<some>
    <row id="1">
        <stuff>1</stuff>
    </row>
</some>

Then run this script:

DECLARE @command varchar(1000),
        @dir varchar(max) = 'D:\API_Schemas\',
        @n int = 0,
        @i int = 1,
        @sql nvarchar(max)

DECLARE @files TABLE (
    id INT IDENTITY(1,1),
    files varchar(1000)
)

SET @command = 'dir "'+ @dir +'" /B'

INSERT INTO @files (files)
EXEC xp_cmdshell @command

SELECT @n = COUNT(*) 
FROM @files
WHERE files like '%.xml%'

WHILE @n >= @i
BEGIN

    SELECT @sql = N'
    INSERT INTO XMLstore (IncomingXMLfile, ImportDate)
    SELECT  CONVERT(XML, BulkColumn) AS BulkColumn, GETDATE() 
    FROM OPENROWSET(BULK '''+@dir+files+''', SINGLE_BLOB) AS x;'
    FROM @files
    WHERE id = @i

    EXEC sp_executesql @sql

    SET @i = @i + 1
END

After that I select from table XMLstore:

SELECT *
FROM XMLstore

And get output:

IncomingXMLfile                                 ImportDate
<some><row id="1"><stuff>1</stuff></row></some> 2016-10-06 10:17:41.453
<some><row id="2"><stuff>2</stuff></row></some> 2016-10-06 10:17:41.457

Description:

Files are stored here D:\API_Schemas\. I use xp_cmdshell to run command dir "D:\API_Schemas\" /B to get all files from that directory.

/B is used for enabling

Uses bare format (no heading information or summary)

so we get only filenames. And put them into @files table.

This table got identity column which add id started from 1 to each row (file). So we can iterate throw while loop using simple counter (@i).

In while loop we create a dynamic SQL query and run it.

Note:

Instead of xp_cmdshell you can use xp_dirtree (it is undocumented and unsupported) like:

DECLARE @dir varchar(100) = 'D:\API_Schemas\'

DECLARE @files TABLE (
    id INT IDENTITY(1,1),
    files varchar(1000),
    depth int,
    [file]  int
)

INSERT INTO @files 
EXEC xp_dirtree @dir, 1, 1
gofr1
  • 15,741
  • 11
  • 42
  • 52