I have an excel file which I receive daily. The number of columns in that file is not specific. My requirement is just to load the last column in my table through SSIS. How will I be able to identify last used column dynamically?
Asked
Active
Viewed 817 times
3
-
is your problem solved? I think that importing excel files using SSIS has many issues and this is one of them so i am interested to see the right answer. Why not replying or accepting one of the answers provided. This helps the others seeking for a similar question – Yahfoufi Dec 12 '17 at 13:58
3 Answers
3
You can use c# script:
Make sure you add Using System.Data.OleDb; to the Namespaces Region and add output Column LastCol and select data type.
public override void CreateNewOutputRows()
{
/*
Add rows by calling the AddRow method on the member variable named "<Output Name>Buffer".
For example, call MyOutputBuffer.AddRow() if your output was named "MyOutput".
*/
string fileName = @"C:\test.xlsx";
string SheetName = "Sheet1";
string cstr = "Provider.ACE.OLEDB.12.0;Data Source=" + fileName + ";Extended Properties=\"Excel 12.0;HDR=No;IMEX=1\"";
OleDbConnection xlConn = new OleDbConnection(cstr);
xlConn.Open();
OleDbCommand xlCmd = xlConn.CreateCommand();
xlCmd.CommandText = "Select * from [" + SheetName + "]";
xlCmd.CommandType = CommandType.Text;
OleDbDataReader rdr = xlCmd.ExecuteReader();
int rowCt = 0; //Counter
while (rdr.Read())
{
//skip headers
if (rowCt != 0)
{
int maxCol = rdr.FieldCount;
Output0Buffer.AddRow();
Output0Buffer.LastCol = (int)rdr[maxCol];
}
rowCt++; //increment counter
}
}

KeithL
- 5,348
- 3
- 19
- 25
-
I know this is late, but someone just upvoted and made me look again. I should have included xlConn.Close(); Without it, the connection will remain open until the package completes and you cannot manipulate the file (i.e. move to a processed folder). – KeithL Feb 12 '18 at 15:23
1
Solution Overview
Use a Script Task to:
- Get the last column index
Use the following function to convert the index to Column Letter (ex: 1 -> A)
Private Function GetExcelColumnName(columnNumber As Integer) As String Dim dividend As Integer = columnNumber Dim columnName As String = String.Empty Dim modulo As Integer While dividend > 0 modulo = (dividend - 1) Mod 26 columnName = Convert.ToChar(65 + modulo).ToString() & columnName dividend = CInt((dividend - modulo) / 26) End While Return columnName End Function
Build the SQL Command that read only the last column
- Choose this query as Excel Source
Detailed Solution
This answer is assuming that the Sheet Name is Sheet1
, and the programming language used is VB.Net
- First create an SSIS variable of type string (i.e. @[User::strQuery])
- Add another variable that contains the Excel File Path (i.e. @[User::ExcelFilePath])
- Add A Script Task, and select
@[User::strQuery]
as ReadWrite Variable, and@[User::ExcelFilePath]
as ReadOnly Variable (in the script task window) - Set the Script Language to VB.Net and in the script editor window write the following script:
Note: you have to imports System.Data.OleDb
m_strExcelPath = Dts.Variables.Item("ExcelFilePath").Value.ToString
Dim strSheetname As String = String.Empty
Dim intLastColumn As Integer = 0
m_strExcelConnectionString = Me.BuildConnectionString()
Try
Using OleDBCon As New OleDbConnection(m_strExcelConnectionString)
If OleDBCon.State <> ConnectionState.Open Then
OleDBCon.Open()
End If
'Get all WorkSheets
m_dtschemaTable = OleDBCon.GetOleDbSchemaTable(OleDbSchemaGuid.Tables,
New Object() {Nothing, Nothing, Nothing, "TABLE"})
'Loop over work sheet to get the first one (the excel may contains temporary sheets or deleted ones
For Each schRow As DataRow In m_dtschemaTable.Rows
strSheetname = schRow("TABLE_NAME").ToString
If Not strSheetname.EndsWith("_") AndAlso strSheetname.EndsWith("$") Then
Using cmd As New OleDbCommand("SELECT * FROM [" & strSheetname & "]", OleDBCon)
Dim dtTable As New DataTable("Table1")
cmd.CommandType = CommandType.Text
Using daGetDataFromSheet As New OleDbDataAdapter(cmd)
daGetDataFromSheet.Fill(dtTable)
End Using
'Get the last Column Index
intLastColumn = dtTable.Columns.Count
End Using
'when the first correct sheet is found there is no need to check others
Exit For
End If
Next
OleDBCon.Close()
End Using
Catch ex As Exception
Throw New Exception(ex.Message, ex)
End Try
Dim strColumnname as String = GetExcelColumnName(intLastColumn)
Dts.Variables.Item("strQuery").Value = "SELECT * FROM [" & strSheetname & strColumnname & ":" & strColumnname & "]"
Dts.TaskResult = ScriptResults.Success
End Sub
Private Function GetExcelColumnName(columnNumber As Integer) As String
Dim dividend As Integer = columnNumber
Dim columnName As String = String.Empty
Dim modulo As Integer
While dividend > 0
modulo = (dividend - 1) Mod 26
columnName = Convert.ToChar(65 + modulo).ToString() & columnName
dividend = CInt((dividend - modulo) / 26)
End While
Return columnName
End Function
- Then you have to add an Excel connection manager, and choose the excel file that you want to import (just select a sample to define the metadata for the first time only)
- Assign a default value of
Select * from [Sheet1$]
to the variable@[User::strQuery]
- In the Data Flow Task add an Excel Source, choose SQL Command from variable, and select
@[User::strQuery]
- Set the DataFlow Task
Delay Validation
property toTrue
- Add other components to DataFlow Task
References

Hadi
- 36,233
- 13
- 65
- 124
0
No you can't do that. The number of columns and data types must be determined beforehand and can't be changed. Otherwise SSIS will fail. So no way to dynamically get the last column. Workaround is maybe get the last column from inside excel using some macro then use that as the source for SSIS.

thotwielder
- 1,563
- 7
- 44
- 83
-
This is not true at all, SSIS has many option that give dynamicity to the Data Flow, there are many methods that can achieve this. Just take a look at my answer and KeithL answer. And by the way, i am not the one who downvoted you question. – Hadi Dec 08 '17 at 20:44