0

I'm having to write a program that reads from an Excel file and places it into a DataTable to export to another Excel file with changed column headers. The data within the columns should stay the same. I'm currently able to get it to the DataTable, although, I have a ton of information missing (individual cells are left empty/null).

Here is what my Excel file looks like:

+----------------------------------------------+
|  ID   |  AccNum  | CustName  | City  | State | 
+----------------------------------------------+
| 02345 |  065812  | CustName1 | City1 |  KS   |
| 02346 |  087425  | CustName2 | City2 |  MO   |
| 02347 |  054785  | CustName3 | City3 |  KS   |
| ..... |  ......  | ......... | ..... |  ..   |
+----------------------------------------------+

Code so far for reading Excel to DataTable:

public DataTable ReadExcel(string fileName, string TableName)
{
    //Creates new DataTable for results
    DataTable table = new DataTable();
    //Creates connection string for Excel file
    OleDbConnection connection = new OleDbConnection(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName + ";Extended Properties=\"Excel 8.0\"");
    //Creates command to be ran
    OleDbCommand cmd = new OleDbCommand("SELECT * FROM [" + TableName + "$]", connection);

    try
    {
        //Launches connection
        connection.Open();
        //Creates and Executes data reader
        OleDbDataReader reader = cmd.ExecuteReader();
        //Loop until the reader closes
        while (!reader.IsClosed)
        {
            //Load reader data to DataTable
            table.Load(reader);
        }
    }
    catch (OleDbException ode)
    {
        Console.WriteLine("ERROR: " + ode.Message);
    }
    finally
    {
        //No matter what- close the connection
        connection.Close();
    }


    //Return the DataTable
    return table;
}

My expected output is to match the sample Excel file format. However, I'm getting the following:

+----------------------------------------------+
|  ID   |  AccNum  | CustName  | City  | State | 
+----------------------------------------------+
|  2345 |   65812  | CustName1 | City1 |  KS   |
|       |   87425  | CustName2 | City2 |  MO   |
|  2347 |   54785  | CustName3 | City3 |  KS   |
| ..... |  ......  | ......... | ..... |  ..   |
+----------------------------------------------+

The only data missing is the ID number along with some leading zeros that are wanting to be kept. With this sample data, for the sake of the length of the post, I've only supplied the three customer samples. With more data, there are more missing ID numbers in no particular order.

Am I grabbing the data incorrectly? Is there something I'm missing? I will gladly answer any questions. Thanks in advance.

Jaskier
  • 1,075
  • 1
  • 10
  • 33
  • Probably not the sole issue, but why are you looping with `while (!reader.IsClosed)`? You only need to call `DataTable.Load` once. – Zer0 Feb 21 '19 at 15:41
  • @Zer0 , mainly because I didn't know I only needed to call it once. Thanks for the suggestion, I'll give it a go! – Jaskier Feb 21 '19 at 15:42
  • 1
    Verify that the source data is correctly formatted. I suspect that it's skipping cells where it can't implicitly convert into the expected type. – Chris Pickford Feb 21 '19 at 15:43
  • @ChrisPickford , changing the format seems to get the expected data output (although, the leading zeros are still gone)! Though, it doesn't make sense why it would skip certain cells within the same column, especially when the entire column has the same formatting? – Jaskier Feb 21 '19 at 15:48
  • The values it's skipping cannot be implicitly converted. Find one of the missing cells in the source data and look at **exactly** what the value is. Leading zeros are probably being removed because it's converting to a numeric data type. – Chris Pickford Feb 21 '19 at 15:55
  • @ChrisPickford Oh holy moly. I guess I never saw that. All the values being skipped are being stored as text while the rest of the column is stored as a number. Is there a way to get around this (from my C# code) without manually editing the Excel document? Ideally, I only want to select the file and not have to open it – Jaskier Feb 21 '19 at 16:01
  • Sure, you'll need to ditch the `table.Load` in favour of manually constructing your DataTable. Iterate over each row and treat each cell as raw text. Parse into target data type explicitly. Alternatively, just use [SSIS](https://learn.microsoft.com/en-us/sql/integration-services/sql-server-integration-services) – Chris Pickford Feb 21 '19 at 16:05
  • 1
    OLEDb typically makes a best guess of the data type based on the first 8 rows of data - this can be overridden by the connection string or registry see if these links help with mixed data types. https://stackoverflow.com/questions/9918279/mixed-data-type-in-excel-imex-doesnt-work & https://stackoverflow.com/questions/3232281/oledb-mixed-excel-datatypes-missing-data – PaulF Feb 21 '19 at 16:09
  • @ChrisPickford , Thank you for the suggestions! I'll try to see what I can come up with. Not sure if SSIS will be of much use as I'm only going to be importing the finished Excel file to an MS Access database – Jaskier Feb 21 '19 at 16:46
  • @PaulF , Thanks for the resources! I'll give them a go – Jaskier Feb 21 '19 at 16:46

0 Answers0