3

I am exporting Data from SQL Server to Excel, utilizing SSIS Data Flow Task. Here all columns appear as Text despite export formatting. Hence I need to develop a SSIS Script task to do the necessary conversion. I am facing trouble in developing the script.

Excel Workbook before Formatting

Excel Workbook Before Formatting See, the Excel Cell has no Apostrophe and the Number type is also "General" but the message says The number in this cell is formatted as text or preceded by an apostrophe

I have Tried different options available in the internet, but unsuccessfully.

#region Namespaces
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Text.RegularExpressions;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.Runtime.InteropServices;
using Excel = Microsoft.Office.Interop.Excel;
#endregion

namespace ST_de899f405b7b4083b0ad8cba6b3df2e3
{

[Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
    public void Main()
    {
        string inputFile = (string)Dts.Variables["Target_FullFilePath"].Value;
        Excel.Application ExcelApp = new Excel.Application();
        Excel.Workbook ExcelWorkbook = ExcelApp.Workbooks.Open(inputFile);
        Excel.Range formatRange;
        ExcelApp.Visible = true;

        foreach (Excel.Worksheet ExcelWorksheet in ExcelWorkbook.Sheets)
        {
            ExcelWorksheet.Select(Type.Missing);
            ExcelWorksheet.Columns[2].NumberFormat = "";
            ExcelWorksheet.Columns[3].NumberFormat = "";
            ExcelWorksheet.Columns[4].NumberFormat = "0.00000";
            ExcelWorksheet.Columns[5].NumberFormat = "yyyy-MM-dd";
        }

        ExcelWorkbook.Save();

        GC.Collect();
        GC.WaitForPendingFinalizers();

        ExcelWorkbook.Close(Type.Missing, Type.Missing, Type.Missing);
        Marshal.FinalReleaseComObject(ExcelWorkbook);

        ExcelApp.Quit();
        Marshal.FinalReleaseComObject(ExcelApp);
    }   
    enum ScriptResults
    {
        Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
        Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
    };
    #endregion
}
}

Expected Result: Columns numbered B, C, D to look like decimal/integer numbers and also similarly filtered. Column E to look like Date and also similarly filtered.

This is how I want Excel file to look like, after formatting through SSIS

What I want through Script in SSIS

I confirm the corresponding columns have relevant values only except column header.

Pradyumna
  • 43
  • 1
  • 7
  • I will edit your question and mention that you are exporting to Excel. Is CSV (text file) an option? You should also describe your actual issue - why is it unsuccessful? – Nick.Mc Aug 15 '19 at 04:41
  • Nick, csv/text file is not an option... Although I am okay to export in CSV/TXT and convert it to Excel, if that solves my need.... Thanks. – Pradyumna Aug 20 '19 at 10:50

1 Answers1

1

Before providing the solution, i have to explain some points about Excel Number Format

What is Number Format property?

Referring to Number format codes documentation:

You can use number formats to change the appearance of numbers, including dates and times, without changing the actual number. The number format does not affect the cell value that Excel uses to perform calculations. The actual value is displayed in the formula bar.

What is General Number format?

Referring to Reset a number to the General format documentation:

The General format is the default number format that Excel applies when you type a number. For the most part, numbers that are formatted with the General format are displayed just the way that you type them.

How Date are stored in Excel?

Referring to How Dates Work in Excel:

The dates in Excel are actually stored as numbers, and then formatted to display the date.

Your excepted result

You mentioned that:

Expected Result: Columns numbered 16, 17, 22 to be converted to "General" and look like decimal numbers. Column 31 to be converted to "General" and look like Date.

Based on what we mentioned you cannot convert column 31 to "General" and make it look like Date.

Solution

You just need to set NumberFormat property to an empty string to set it as "General"

ExcelWorksheet.Columns[16].NumberFormat = "";

Experiments

I Created an Excel file with 4 columns: NumberColumn, DateColumn, DecimalColumn and StringColumn as shown in the image above:

enter image description here

I created a console application with the following code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Text.RegularExpressions;
using Excel = Microsoft.Office.Interop.Excel;
using System.Runtime.InteropServices;

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            string inputFile = @"D:\Test.xlsx";

            Excel.Application ExcelApp = new Excel.Application();
            Excel.Workbook ExcelWorkbook = ExcelApp.Workbooks.Open(inputFile);
            ExcelApp.Visible = true;

            foreach (Excel.Worksheet ExcelWorksheet in ExcelWorkbook.Sheets)
            {
                ExcelWorksheet.Select(Type.Missing);

                ExcelWorksheet.Columns[1].NumberFormat = "";
                ExcelWorksheet.Columns[2].NumberFormat = "yyyy-MM-dd"; // convert format to date
                ExcelWorksheet.Columns[2].NumberFormat = "";
                ExcelWorksheet.Columns[3].NumberFormat = "0.00000"; // convert format to decimal with 5 decimal digits
                ExcelWorksheet.Columns[3].NumberFormat = "";
                ExcelWorksheet.Columns[4].NumberFormat = "";


            }
            ExcelWorkbook.Save();

            GC.Collect();
            GC.WaitForPendingFinalizers();

            ExcelWorkbook.Close(Type.Missing, Type.Missing, Type.Missing);
            Marshal.FinalReleaseComObject(ExcelWorkbook);

            ExcelApp.Quit();
            Marshal.FinalReleaseComObject(ExcelApp);
        }
    }
}

After executing the application, the Excel looked like the following:

enter image description here

Discussion and Conclusion

From the image above, we can see that all columns are changed to General Number format, but if values are stored as numbers they will be shown as they are stored: Date values are shown as Excel serials (numbers), decimal values are shown with only one decimal digit, even if we changed the format to five digits before resetting the format to General.

In Brief, you cannot handle how the values are shown when the Number Format is "General", if you need to show values as dates you have to set the number format to yyyy-MM-dd or any other date format.

Reference


Update 1

Instead of using ExcelWorksheet.Columns[1].NumberFormat, try using the following code:

ExcelWorksheet.Cells[1,1].EntireColumn.NumberFormat = "";
ExcelWorksheet.Cells[1,2].EntireColumn.NumberFormat = "";
Hadi
  • 36,233
  • 13
  • 65
  • 124
  • Thanks a lot @Hadi for taking your time out for my issues. But, I am sorry, it still didn't solve the problem. Initially the data in Excel is "Column Formatted as text".. If you move your cursor to the left top corner of a cell, it shows, "The number in this cell is **formatted as text** or preceded by an apostrophe." .. This is there despite the **Number Format is General.** I utilised your code. This changed the column type as you mentioned but **the number in this cell still remains formatted as text.** . Please can you suggest further. – Pradyumna Aug 19 '19 at 10:11
  • I am not sure how to add screenshots here for further reference. – Pradyumna Aug 19 '19 at 10:15
  • Since you mentioned that `the text is preceded by an apostrophe`, then you have to remove it because it forces the text format even Number format property is not text. If you need some assistance to write the code just give me a reply. Or just replace it from excel if no need to automate this step – Hadi Aug 19 '19 at 19:21
  • No, the text has no apostrophe, its only the number in that column which was formatted to text... And this is what I want to convert. My quote above, is the Msg displayed in Excel, against a particular cell, when a _Text to Column_ operation is done on it. To tell you the back gound.. I am extracting data from SQL Server. The data even numbers and date is saved as Text in the Excel.. To convert them, I need to perform a Text to column operation in Excel and save it as General.. Manually... I am trying to do this operation though SSIS – Pradyumna Aug 20 '19 at 10:22
  • I have updated the Question with screenshots from Excel "Existing" and "Required". Please review... Thanks a lot, for your time. – Pradyumna Aug 20 '19 at 10:47
  • I changed the Code.. in my case probably it should be Cells[2,1]... instead.. But i did for Cells[1,2],Cells[1,3].... Cells[1,5], Cells[2,2], Cells[2,3]... Cells[2,5]... But no luck.. nothing changes... – Pradyumna Aug 20 '19 at 12:27
  • `Cells[2,1]` it means the second row the first column `[row index,column index]`. I think there is another solution to the used range and paste it using `Interop.Excel` which will force formatting – Hadi Aug 20 '19 at 12:41
  • @Pradyumna I cannot post an update right-now maybe I will update my answer after some hour – Hadi Aug 20 '19 at 12:42
  • Script doesnt work for me, in ssis its saying script has an error. Also the line Excel.Range formatRange; is saying its specified but never used. – Stew Mar 05 '20 at 09:45
  • @Stew `Excel.Range formatRange;` is useless and I remove it from the script, but what is the script error? Make sure you added `Interop.Excel` assembly as a reference. – Hadi Mar 05 '20 at 19:55