0

There are many aspects of what I want to do but I think learning one piece will let me derive the rest.

I have an SSIS package that uses powershell to download a publicly available zip file, an execute script to unzip with 7zip and then data flows to load the unzipped files to corresponding tables.

What I want to do is add the file name (and eventually other aspects of the file like creation date, record counts and so on) from any one of the unzipped files to a log table that keeps track of the summary level details of the files.

How do I dynamically store this type of information as part of the package? Derived columns? But what's the input? Thanks!

Tim F
  • 85
  • 1
  • 11
  • 1
    How are you identifying the file name at the moment? There are many examples online of using the For Each File container to iterate through files and save their filenames. If you also want other file attributes then It might be easier to just build it_all_ in powershell. – Nick.Mc Nov 27 '18 at 00:56
  • It's identified, I guess, by virtue of being in the package? Is there not a way to systematically store and then retrieve later in the same package that file 1 of 4 was named xyz.txt and had 100 rows? – Tim F Nov 27 '18 at 03:51
  • There is most certainly a way to systematically store filenames and attributes. How does your package _right now_ know that it has to load file xyz.txt? Is it hard coded into that package? Have a read of this https://stackoverflow.com/questions/38151342/ssis-how-to-loop-through-files-in-folder-and-get-pathfile-names-and-finally-e – Nick.Mc Nov 27 '18 at 04:10
  • OK, I'm using a data flow task type of "flat file source" that uses a connection manager which looks for one of the four files I just unzipped earlier in the package. I'll read that note you posted right now. – Tim F Nov 27 '18 at 16:26
  • that link above was helpful but not quite what I was looking for. I'm really just wanting to store attributes of the files in a sort of batch log. – Tim F Nov 27 '18 at 18:26

3 Answers3

0

you can use Variable to store File name when your loop the files, and after file been loaded to table, then u can use current file name to insert/update log table.

BeiBei ZHU
  • 343
  • 2
  • 12
0

There are many options for dynamically working with files through SSIS. Below is an overview of one method. Of course this can vary, depending on your specific needs and requirements.

  • Add a Foreach Loop Container. On the Collection pane, the Folder property can either be set using the GUI as well as through a parameter or variable with the Directory expression. Searching sub folders can also be set by checking the "Traverse subfolders" checkbox or using the Recurse expression like the Folder field.
  • The Files field will indicate the files to use and wildcards can be used. * will match any number of characters. For example, *.csv will get all csv files regardless of name and Test*.txt will return all .txt files with names beginning Test, regardless of how many or which characters follow. To limit this to a single character, use ?. The FileSpec expression will allow this to be set dynamically similar to the directory by variable or parameter.
  • The Variable Mappings pane will allow for setting a variable to hold a file name from the directory. Add a variable that will hold the file name to index 0 to map these.
  • You indicated that you wanted to store the file name. The detail of this can be controlled from the "Retrieve file name" field on the Collection window. As their names imply, Fully Qualified will hold the complete file path, Name and Extension will return the file name with extension, and Name Only is just the file name.
  • As for other aspects of the file, I'd recommend a using a Script Task for this for more functionality. The C# FileInfo class provides options for finding details about the file such as the creation date, last time the file was accessed, and when the file was most recently written to. Additonal information on this can be found here.
  • For the record counts from the file, you'll need to create a Connection Manager for this and work with the data within the package. I'm assuming these are flat files? If so, creating a Flat File Connection Manager, and setting the same variable from the Variable Mappings pane of the Foreach Loop to the ConnectionString expression of the Connection Manager will allow you to dynamically loop through each file. Make sure that the Fully Qualified option is used for the "Retrieve file name" field as earlier if you decide to do this. You will also want to configure the correct columns and data types for the Connection Manager ahead of time. This same process can be followed for Excel files, however the variable with the file name will be used on the ExcelFilePath expression instead.
  • As for storing information about a file in a log table, there are a multitude of options for these. A very basic example of an Insert statement within an Execute SQL Task that's placed within the Foreach Loop is below. The 3 part table name is only necessary if you're using a table that differs from the initial catalog of the Connection Manager. The ? is the parameter marker (assuming this is an OLE DB connection). After this, map the same variable/parameter that stores the file name using the Parameter Mapping pane. Set the direction to Input, appropriate data type (likely VARCHAR/NVARCHAR), 0 in the Parameter Name field to indicate this is the first parameter in the SQL statement (additional ? can be used for subsequent parameters in the SQL statement, then increment this field in accordance), and the default Parameter Size can be left at -1. Again, this is a simple example and you'll probably want store more about the files and their contents, but this can get you started.

Sample SQL Insert:

INSERT INTO YourDataBase.YourSchema.YourTable (ColumnToHoldFileName)
VALUES (?)
userfl89
  • 4,610
  • 1
  • 9
  • 17
  • somehow I'm not getting how to pass a stored variable to the INSERT statement. What's the specific syntax? Just so I can learn this one step, I want to simply insert my machine name into an empty field in a table. Within an Execute SQL Task Editor I choose "System::MachineName", direction Input, data type long, default parameter name, parameter size -1. How do I then pass that to a field within my insert statement? – Tim F Nov 27 '18 at 19:43
  • This is an OLE DB connection right? If so, use the same syntax as the insert statement above. @[System::MachineName] is a string variable, so change the data type to either VARCHAR or NVARCHAR (if the destination column is Unicode). If this is the only parameter that's being used, the parameter name will also need to be set to 0. – userfl89 Nov 27 '18 at 19:55
  • yeah I didn't realize the variable options weren't quite right, that did the trick! – Tim F Nov 27 '18 at 20:13
0

figured it out from looking at other posts. I had to expand the parameter size...easy fix!

Tim F
  • 85
  • 1
  • 11