11

I use Epplus to reading xlsx files from stream.

It has a bug , it cant read some columns in my workbook.How can read xlsx files from stream to datatable without epplus ?

my older code:

 public static DataSet ReadExcelFile(Stream stream)
    {
        try
        {
            //2. Reading from a OpenXml Excel file (2007 format; *.xlsx)
            IExcelDataReader excelReader =    
                             ExcelReaderFactory.CreateOpenXmlReader(stream);
            //...
            DataSet result = excelReader.AsDataSet();

            return result;

        }
        catch (Exception x)
        {
            throw x;
        }
    }

I didnt report it, but i tried so much combinations.If there are empty columns in worksheet ,epplus reader cant read correctly column values.

Mennan
  • 4,451
  • 13
  • 54
  • 86

2 Answers2

18

"It has a bug , it cant read some columns in my workbook"

Can you describe the bug, have you reported it or is it already known, what version are you using?

Here's a simple approach to load an excel file into a DataTable with EPPlus.

public static DataTable getDataTableFromExcel(string path)
{
    using (var pck = new OfficeOpenXml.ExcelPackage())
    {
        using (var stream = File.OpenRead(path))
        {
            pck.Load(stream);
        }
        var ws = pck.Workbook.Worksheets.First();  
        DataTable tbl = new DataTable();
        bool hasHeader = true; // adjust it accordingly( i've mentioned that this is a simple approach)
        foreach (var firstRowCell in ws.Cells[1, 1, 1, ws.Dimension.End.Column])
        {
            tbl.Columns.Add(hasHeader ? firstRowCell.Text : string.Format("Column {0}", firstRowCell.Start.Column));
        }
        var startRow = hasHeader ? 2 : 1;
        for (var rowNum = startRow; rowNum <= ws.Dimension.End.Row; rowNum++)
        {
            var wsRow = ws.Cells[rowNum, 1, rowNum, ws.Dimension.End.Column];
            var row = tbl.NewRow();
            foreach (var cell in wsRow)
            {
                row[cell.Start.Column - 1] = cell.Text;
            }
            tbl.Rows.Add(row);
        }
        return tbl;
    }
}
Tim Schmelter
  • 450,073
  • 74
  • 686
  • 939
  • i got null object referance on var ws = pck.Workbook.Worksheets["Worksheet1"]; any idea ? – Mennan Jun 28 '12 at 07:38
  • @Mennan: Try `var ws = pck.Workbook.Worksheets.First();` instead (edited my answer accordingly). – Tim Schmelter Jun 28 '12 at 07:48
  • I use Epplus version 3.0.0.2 , and pck.Workbook.Worksheets has no First() method sorry , also i tried Worksheets[0] , still is null.I dont understand – Mennan Jun 28 '12 at 07:55
  • @Mennan: Then you're not using .NET framework >= 3.5. Try `var ws = pck.Workbook.Worksheets[1];` since index of `ExcelWorksheets` starts with 1. Can you open that excel file manually? – Tim Schmelter Jun 28 '12 at 08:01
  • 3
    Thanks for this, helped a lot. I had some issues regarding empty columns though, had to remove them from my excel for this to work. – Contra Sep 12 '12 at 11:29
  • @Contra thanks a lot, for removing empty columns tip. Workbook.Worksheets.First(); throws index out of exception when you have empty columns in worksheet. – emre nevayeshirazi Dec 20 '13 at 16:10
  • @Tim Thank you this is very useful for something I am working on. To contribute something the references needed are: using System.IO; using OfficeOpenXml; – JPK Mar 12 '14 at 08:39
  • This is working perfectly. Only one doubt. Can I set the sheet by name instead of `var ws = pck.Workbook.Worksheets[1];`. The file I want to parse is an everchanging daily report, somethimes they add a sheet to the beginning, sometimes they delete one, but the sheet names are constant. So is it possible? – fishmong3r Jul 21 '14 at 10:11
  • 2
    @fishmong3r: yes, you can use `FirstOrDefault` to get the first sheet with that name (or null if there's none): `var ws = pck.Workbook.Worksheets.FirstOrDefault(sheet => sheet.Name == "Name");` – Tim Schmelter Jul 21 '14 at 10:39
  • @TimSchmelter Many thanks. I know it's a different thing but can you help me how to start on including the EPPlus.dll? I did this: https://www.youtube.com/watch?v=x-KK7bmo1AM but I still get `Could not load file or assembly` – fishmong3r Jul 21 '14 at 10:46
  • @TimSchmelter, how to improve this to be able to keep the data type for the column? Your solution would convert everything to strings, right? – Gustav Aug 02 '16 at 13:23
  • 1
    @Gustav: i think that's difficult since excel is storing everything as string and numbers and datetimes as doubles. So you need a lot of conversion anyway. If you know the target columns/types you can do the conversion manually, for example with `DateTime.Parse` or `int.Parse`. – Tim Schmelter Aug 02 '16 at 13:29
  • @TimSchmelter Ok, I see, and that makes sense. Thanks for your very quick answer, much appreciated. – Gustav Aug 02 '16 at 13:38
0

This is way past, however it could still help someone. Apparently some columns in my worksheet were merged, so for example, if columns A and B are merged it only recognizes column A as the one with the value, and so it returns column B as empty, when i call on that particular cell's value(B). To get past this, make sure you know which cells are merged and then grab only the first one and regard the rest of the merged cells as null