2

I'm new here. I have a question in regards to SSIS and dynamic file names. I've found an article that can help for incrementing file names by appending dates, but I actually need to keep a rolling filename counter of 001-999 and start over at 001 for each time the file is created. Here's the article I reviewed:

https://www.red-gate.com/simple-talk/sql/ssis/passing-variables-to-and-from-an-ssis-task/

For instance, my first file would be "P12345001" where "001" is the first instance of the file. The next time the SSIS package runs, I'd want the filename to be "P12345002" and so on, until it gets to "P12345999" and the next file would go back to "P12345001".

Is this possible in SQL or SSIS or do I need to script something via a language like VB.NET or C# (which I only have limited experience with)?

If it helps, I am running the SSIS package on MS SQL Server 2012. Please let me know if I've included all pertinent information or not. Many thanks in advance!

jderekc
  • 115
  • 1
  • 12

1 Answers1

6

SQL Work

While this is syntactically correct for SQL Server, you can use the same concept with whatever RDBMS you like, just modify it as needed.

Data Storage

To solve this, you need some place of storing the previous run's value. It appears that a key value type of table would be sufficient for your needs.

-- Simple table structure
CREATE TABLE dbo.FileExecutionHistory
(
    BaseFileName varchar(50) NOT NULL
,   LastSequence int NOT NULL 
,   CONSTRAINT PK_dbo_FileExecutionHistory PRIMARY KEY
    (
        BaseFileName
    )
);

Into that you will store the name of the file (P12345) and the last sequence number used (1).

Data Access

Within SSIS, you will pass in the file name to this query. Note the use of ? - that will be the OLE DB provider's syntax for ordinal based query parameterization.

The inner query generates at most 2 rows. The last sequence used for this file name (if it is found) and 0 (not found). The outer query uses the TOP(1) argument to only allow one row to be generated and we order by the value descending - thus ensuring we only take the largest value.

We use the modulo % operator with 999 to restrict the allowable domain of values to 0 to 998.

However, since you want 1 to 999, we then add 1 to the resultant value.

-- Access pattern
SELECT TOP(1)
    (D.LastSequence % 999) + 1  AS LastSequence
FROM
(
    SELECT
        FEH.LastSequence
    FROM
        dbo.FileExecutionHistory AS FEH
    WHERE
        FEH.BaseFileName = ?
    -- Handle the does not exist case
    UNION
    SELECT
        CAST(0 AS int)
) D
ORDER BY
    D.LastSequence DESC;

Data Storage

Again, we will see the ? in play but this time we use it for both our file name and the sequence value.

Since the OLE DB provider is ordinal based, we would need to use four place holders and map in the "same" variable twice. That's silly so I declare a SQL Server variables up front and assign into them. For consistency, you can apply this to the Data Access pattern as well.

The Merge statement is clunky and uncertain in results so I use the tested and true update/insert pattern. I update the Value in my table for the file which will match nothing if it doesn't exist. I then attempt to insert a value into our table if it doesn't already exist.

Only one of those last two statements will ever perform an action so this is good. Plus, we have our primary key constraint preventing duplicates on file name so this is double plus good.

-- Storage pattern
DECLARE
    @BaseFileName varchar(50) = ?
,   @LastSequence int = ?;

UPDATE
    FEH
SET
    LastSequence = @LastSequence
FROM
    dbo.FileExecutionHistory AS FEH
WHERE
    FEH.BaseFileName = @BaseFileName;

INSERT INTO
    dbo.FileExecutionHistory
(
    BaseFileName
,   LastSequence
)
SELECT
    D.BaseFileName
,   D.LastSequence
FROM
(
    VALUES (@BaseFileName, @LastSequence)
) D (BaseFileName, LastSequence)
    LEFT OUTER JOIN
        dbo.FileExecutionHistory AS FEH
        ON FEH.BaseFileName = D.BaseFileName
WHERE
    FEH.BaseFileName IS NULL;

SSIS work

Break the work down into small pieces and use what SSIS gives you natively. In this approach, we're going to use Variables to do most of the heavy lifting.

Variables

You're looking at a package with a few obvious Variables in it. One Variable to hold your Sequence number. One Variable to hold the base file name. I assume the full file name is probably something like P12345001.txt so you'll want to keep track of that stuff too.

  • User::BaseFileName String -> P12345
  • User::SequenceNumber Int -> 0
  • User::FileExtension String -> .txt
  • User::CurrentFolder String -> C:\ssisdata\Output

  • User::SequencePad String > 001

  • User::FullFileName String > P12345001.txt
  • User::CurrentFileName String > C:\ssisdata\Output\P12345001.txt

The first 4 variables will be "normal". The last 3 will have Expressions in them. Expressions are simple formulas much as you'd see in Excel. Notice that I've assigned a value of 0 for the sequence number. That's not allowed in your scenario but it will be helpful for us to ensure all of the pieces of the puzzle are working as expected.

The expression for SequencePad will use a classic technique of padding N characters to the left of our value and then slicing the right N most characters out. For values like 999, there will be no net result. For 1, we'll end up with 001. Since we're dealing with a numeric value for SequenceNumber, we must first cast it to a string before concatenating + with the triple zeroes.

RIGHT("000" + (DT_WSTR, 3) @[User::SequenceNumber], 3)

FullFileName simply concatenates 3 variables together. Two "hard coded" values and the padded sequence number we built above.

@[User::BaseFileName] + @[User::SequencePad] + @[User::FileExtension]

Finally, CurrentFileName is built much in the same manner as FullFileName. We're going to concatenate more variables together. The reason for all these little steps is that this is the only way one has to debug Expressions. There is no ability to put break points on them. Thus, make 'em small and composite them together.

The only "trick" to be aware of here is that the slash \ is also an escape character in nearly every language so if you want to use \, you really need to use \\

@[User::CurrentFolder] + "\\" + @[User::FullFileName]

At this point, you can test putting different values in for the first 4 variables and ensure the expressed values look as expected. If you don't need a file extension, just leave that value blank - the expression will work just fine.

Flat File Connection Manager

Create a Flat File Connection manager based on the structure of your file P12345001.txt

Now that you have a variable defined, you will need to use this property as part of a Flat File Connection manager's Connection String property. See here and here for setting the Property. I'll call this Output

Ole DB Connection Manager

You need an OLE DB Connection Manager pointing to the SQL server and database we created the table in (way at the top). I'll assume it's called Config

Execute SQL 1

The purpose of this task is to assign a value into @[User::SequenceNumber]. Drag an Execute SQL Task onto the canvas and configure it much as I show in this answer. For our Connection, we'll use Config. The SQL Statement is the data access query I defined above. In the Result Set tab, we'll specify User::SequenceNumber

Run your package and if it doesn't blow up, we can assume things went well.

Data Flow Task

Somewhere, somehow you're using the computed file name and I assume you're going to query some table and push rows into that file. Your Data Flow will look something like an OLE DB Source component wired into a Flat File Destination. An example, with Biml none-the-less, is here

Execute SQL 2

Now that our data flow is complete, we need to update our original table with our used sequence number.

This will be another Execute SQL Task, using the third query from above, with No resultset defined. We will map in the variables @[User::FileName] and @[User::SequenceNumber] (in that order) as parameters 0 and 1.

Biml

The Business Intelligence Markup Lanaguage, Biml, can be thought of as the operating system for business intelligence. It has pieces in there to handle all aspects of what you need to describe to get the work done. Here, I'm using it to describe what an SSIS package would look like that has an auto increment file name. To make this work for you

  1. Download and install BimlExpress
  2. Open an SSIS project and add a new Biml file. Paste the contents below into it
  3. Edit line 11 and 12 to point to a valid SQL Server and folder on your computer
  4. Right click on BimlScript.biml and select Generate SSIS Package
  5. There should now be an SSIS package in your project called so_45355289.dtsx - run that sucker

Assuming all that works, you can either dig in an modify the supplied Biml to meet your needs (change the queries to actually extract the data you care about and make the flat file format match). Or you can just modify the existing SSIS package and forget about Biml for now. Eventually you'll probably care about it though because it's a huge time saver.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <FileFormats>
        <FlatFileFormat Name="FF Output">
            <Columns>
                <Column Name="DatabaseName" Length="128" Delimiter="Tab" />
                <Column Name="DatabaseId" DataType="Int32" Delimiter="CRLF" />
            </Columns>
        </FlatFileFormat>
    </FileFormats>
    <Connections>
        <OleDbConnection Name="Config" ConnectionString="Provider=SQLNCLI11;Server=localhost\DEV2016;Initial Catalog=tempdb;Integrated Security=SSPI;"  />
        <FlatFileConnection Name="Output" FileFormat="FF Output" FilePath="C:\ssisdata\output\P12345.txt"  />
    </Connections>
    <Packages>
        <Package Name="so_45355289" ConstraintMode="Linear">
            <Connections>
                <Connection ConnectionName="Output">
                    <Expressions>
                        <Expression ExternalProperty="ConnectionString">@[User::CurrentFileName]</Expression>
                    </Expressions>
                </Connection>
            </Connections>
            <Variables>
                <Variable Name="BaseFileName" DataType="String">P12345</Variable>
                <Variable Name="SequenceNumber" DataType="Int32">0</Variable>
                <Variable Name="FileExtension" DataType="String">.txt</Variable>
                <Variable Name="CurrentFolder" DataType="String">C:\ssisdata\Output</Variable>
                <Variable Name="SequencePad" DataType="String" EvaluateAsExpression="true">RIGHT("000" + (DT_WSTR, 3) @[User::SequenceNumber], 3)</Variable>
                <Variable Name="FullFileName" DataType="String" EvaluateAsExpression="true">@[User::BaseFileName] + @[User::SequencePad] + @[User::FileExtension]</Variable>
                <Variable Name="CurrentFileName" DataType="String" EvaluateAsExpression="true">@[User::CurrentFolder] + "\\" + @[User::FullFileName]</Variable>
            </Variables>
            <Tasks>
                <ExecuteSQL Name="SQL Get Current Sequence" ConnectionName="Config" ResultSet="SingleRow">
                    <DirectInput><![CDATA[SELECT TOP(1)
    (D.LastSequence % 999) + 1  AS LastSequence
FROM
(
    SELECT
        FEH.LastSequence
    FROM
        dbo.FileExecutionHistory AS FEH
    WHERE
        FEH.BaseFileName = ?
    -- Handle the does not exist case
    UNION
    SELECT
        CAST(0 AS int)
) D
ORDER BY
    D.LastSequence DESC;]]></DirectInput>
                    <Parameters>
                        <Parameter Name="0" DataType="String" Length="128" VariableName="User.BaseFileName" />
                    </Parameters>
                    <Results>
                        <Result Name="0" VariableName="User.SequenceNumber" />
                    </Results>
                </ExecuteSQL>
                <Dataflow Name="DFT Generate data">
                    <Transformations>
                        <OleDbSource Name="OLESRC Query" ConnectionName="Config">
                            <DirectInput>SELECT name AS DatabaseName, database_id AS DatabaseId FROM sys.databases;</DirectInput>
                        </OleDbSource>
                        <FlatFileDestination Name="FFDST Output" ConnectionName="Output" Overwrite="true">
                        </FlatFileDestination>
                    </Transformations>
                </Dataflow>
                <ExecuteSQL Name="SQL Store Sequence" ConnectionName="Config">
                    <DirectInput><![CDATA[-- Storage pattern
DECLARE
    @BaseFileName varchar(50) = ?
,   @LastSequence int = ?;

UPDATE
    FEH
SET
    LastSequence = @LastSequence
FROM
    dbo.FileExecutionHistory AS FEH
WHERE
    FEH.BaseFileName = @BaseFileName;

INSERT INTO
    dbo.FileExecutionHistory
(
    BaseFileName
,   LastSequence
)
SELECT
    D.BaseFileName
,   D.LastSequence
FROM
(
    VALUES (@BaseFileName, @LastSequence)
) D (BaseFileName, LastSequence)
    LEFT OUTER JOIN
        dbo.FileExecutionHistory AS FEH
        ON FEH.BaseFileName = D.BaseFileName
WHERE
    FEH.BaseFileName IS NULL;]]></DirectInput>
                    <Parameters>
                        <Parameter Name="0" DataType="String" Length="128" VariableName="User.BaseFileName" />
                        <Parameter Name="1" DataType="Int32" VariableName="User.SequenceNumber" />
                    </Parameters>
                </ExecuteSQL>
            </Tasks>
        </Package>
    </Packages>
</Biml>
billinkc
  • 59,250
  • 9
  • 102
  • 159
  • That's absolutely fantastic! Thanks so much for your hard work in putting this together. It will make my life a lot easier. Kudos. – jderekc Jul 28 '17 at 13:26