85

I know that there are different ways to read an Excel file:

  • Iterop
  • Oledb
  • Open Xml SDK

Compatibility is not a question because the program will be executed in a controlled environment.

My Requirement :
Read a file to a DataTable / CUstom Entities (I don't know how to make dynamic properties/fields to an object[column names will be variating in an Excel file])

Use DataTable/Custom Entities to perform some operations using its data.

Update DataTable with the results of the operations

Write it back to excel file.

Which would be simpler.

Also if possible advice me on custom Entities (adding properties/fields to an object dynamically)

shA.t
  • 16,580
  • 5
  • 54
  • 111
Ankesh
  • 4,847
  • 4
  • 38
  • 76
  • @AmiramKorach what abput writing back to excel.... – Ankesh Oct 21 '12 at 14:17
  • I use a commercial 3rd party for that. This have been asked here http://stackoverflow.com/questions/1527790/c-sharp-write-to-excel-using-linq – Amiram Korach Oct 22 '12 at 09:21
  • I think that one of the most efficient ways is with GemBox.Spreadsheet library which has direct methods for [export `DataTable` to _sheet_](https://www.gemboxsoftware.com/spreadsheet/examples/c-sharp-export-datatable-to-excel/501) and [exporting _sheet_ to `DataTable`](https://www.gemboxsoftware.com/spreadsheet/examples/c-sharp-export-excel-to-datatable/502). – Hazel Patton Dec 24 '19 at 10:40
  • Just a little helpful advice, excel files are just zip files. Extracting an excel file will leave you with a few folders. The strings for the file are stored in "[filenamefolder]/xl/sharedStrings.xml" and the workbooks are stored in "[filenamefolder]/xl/workbook.xml" Theoretically you could just unzip the excel file programmatically and extract information from the extracted files. – Zach Pedigo Aug 04 '20 at 14:19

8 Answers8

79

Take a look at Linq-to-Excel. It's pretty neat.

var book = new LinqToExcel.ExcelQueryFactory(@"File.xlsx");

var query =
    from row in book.Worksheet("Stock Entry")
    let item = new
    {
        Code = row["Code"].Cast<string>(),
        Supplier = row["Supplier"].Cast<string>(),
        Ref = row["Ref"].Cast<string>(),
    }
    where item.Supplier == "Walmart"
    select item;

It also allows for strongly-typed row access too.

John McDonald
  • 1,790
  • 13
  • 20
Enigmativity
  • 113,464
  • 11
  • 89
  • 172
  • 6
    Note that Linq-to-Excel uses a list of third-party libs. – florians Jul 28 '14 at 14:14
  • 6
    @fschricker - Only two - "log4net" & "Remotion". – Enigmativity Jul 29 '14 at 01:11
  • Cool library. Thanks for letting us know about it. – shahar eldad Aug 01 '14 at 12:38
  • 13
    And the Access Database Engine, which although Microsoft is another dependency. – Alan B Mar 10 '15 at 12:37
  • 1
    @CodeIt - Yes, it does. – Enigmativity Sep 20 '17 at 23:30
  • 6
    While nice...this isn't a viable production asset because of the "Access Database Engine". – Prisoner ZERO Jul 12 '18 at 19:11
  • 1
    @PrisonerZERO - Yes, but the need to read an Excel file already puts one in the "non-viable production asset" space already. – Enigmativity Jul 13 '18 at 00:45
  • 14
    Not to be a 'Troll'...but...in case someone is thinking about using this. The only place you can "probably" use it is from your own desktop. No one is going to let you install the "Access Database Engine" on Client Desktops or Web Servers...and many places (for good reason) won't even let you install things like that locally. Again, I love the syntax & I love the idea...but this isn't a broadly viable solution. Still...very cool. – Prisoner ZERO Jul 13 '18 at 12:35
  • 1
    @PrisonerZERO At risk of asking a dumb question, what is particularly offensive about the dependency of Access Database Engine? You seem adamant that it is problematic, but I would like to know why if wouldn't mind explaining. – Adam Davis Oct 16 '18 at 19:00
  • 2
    @kevinc - That's true. But why do you mention it? The OP didn't ask about portability. – Enigmativity Dec 30 '18 at 07:33
  • @Enigmativity really? because it won't run on the vast majority of computers in the world. – kevinc Jan 01 '19 at 14:26
  • 1
    @kevinc - That's not true. You require a Windows PC (which are in the majority of desktop operating systems). – Enigmativity Jan 01 '19 at 21:53
  • Yeah, I agree with @Enigmativity - portability does not seem like the issue that Prisoner Zero was discussing. Obviously this isn't a choice if you're in a linux environment, but in a Windows environment, this dependency doesn't seem particularly worse than any other dependency. – Adam Davis Jan 29 '19 at 18:36
  • 1
    Note: .Core needs this nuget System.Data.OleDb instead access dll – Hamit YILDIRIM Sep 12 '21 at 09:04
36

I realize this question was asked nearly 7 years ago but it's still a top Google search result for certain keywords regarding importing excel data with C#, so I wanted to provide an alternative based on some recent tech developments.

Importing Excel data has become such a common task to my everyday duties, that I've streamlined the process and documented the method on my blog: best way to read excel file in c#.

I use NPOI because it can read/write Excel files without Microsoft Office installed and it doesn't use COM+ or any interops. That means it can work in the cloud!

But the real magic comes from pairing up with NPOI Mapper from Donny Tian because it allows me to map the Excel columns to properties in my C# classes without writing any code. It's beautiful.

Here is the basic idea:

I create a .net class that matches/maps the Excel columns I'm interested in:

        class CustomExcelFormat
        {
            [Column("District")]
            public int District { get; set; }

            [Column("DM")]
            public string FullName { get; set; }

            [Column("Email Address")]
            public string EmailAddress { get; set; }

            [Column("Username")]
            public string Username { get; set; }

            public string FirstName
            {
                get
                {
                    return Username.Split('.')[0];
                }
            }

            public string LastName
            {
                get
                {
                    return Username.Split('.')[1];
                }
            }
        }

Notice, it allows me to map based on column name if I want to!

Then when I process the excel file all I need to do is something like this:

        public void Execute(string localPath, int sheetIndex)
        {
            IWorkbook workbook;
            using (FileStream file = new FileStream(localPath, FileMode.Open, FileAccess.Read))
            {
                workbook = WorkbookFactory.Create(file);
            }

            var importer = new Mapper(workbook);
            var items = importer.Take<CustomExcelFormat>(sheetIndex);
            foreach(var item in items)
            {
                var row = item.Value;
                if (string.IsNullOrEmpty(row.EmailAddress))
                    continue;

                UpdateUser(row);
            }

            DataContext.SaveChanges();
        }

Now, admittedly, my code does not modify the Excel file itself. I am instead saving the data to a database using Entity Framework (that's why you see "UpdateUser" and "SaveChanges" in my example). But there is already a good discussion on SO about how to save/modify a file using NPOI.

Dan
  • 3,583
  • 1
  • 23
  • 18
  • 2
    This works like a charm! Simplest solution so far. Both are available as NuGet package. – Stefan May 25 '19 at 09:31
  • 1
    Hi Dan, I couldn't resolve excel file processing without excel software or oledb drivers in server. But using this, i was able to achieve. Also good performance. Thanks a lot. – Ramakrishnankt Dec 24 '19 at 18:00
  • 1
    How can I use this for generic excel files without knowing their column names before? For example, I would rather want to export sheet into a DataTable. – JM217 Jan 10 '20 at 22:14
  • David Pio - Use the new `dynamic` feature. I guess that is what you are looking for. `mapper.Take(0).ToList();` – Piotr Kula Jul 01 '20 at 20:29
  • 1
    Awesome, worked like a charm and saved tons of time. Thanks! – Flatpick13 Jul 20 '20 at 20:46
  • Can you please share a glimpse of the actual excel file you have for this? Also, does this work with .net core? – shaikhspear Jan 14 '21 at 22:29
30

Using OLE Query, it's quite simple (e.g. sheetName is Sheet1):

DataTable LoadWorksheetInDataTable(string fileName, string sheetName)
{           
    DataTable sheetData = new DataTable();
    using (OleDbConnection conn = this.returnConnection(fileName))
    {
       conn.Open();
       // retrieve the data using data adapter
       OleDbDataAdapter sheetAdapter = new OleDbDataAdapter("select * from [" + sheetName + "$]", conn);
       sheetAdapter.Fill(sheetData);
       conn.Close();
    }                        
    return sheetData;
}

private OleDbConnection returnConnection(string fileName)
{
    return new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fileName + "; Jet OLEDB:Engine Type=5;Extended Properties=\"Excel 8.0;\"");
}

For newer Excel versions:

return new OleDbConnection("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName + ";Extended Properties=Excel 12.0;");

You can also use Excel Data Reader an open source project on CodePlex. Its works really well to export data from Excel sheets.

The sample code given on the link specified:

FileStream stream = File.Open(filePath, FileMode.Open, FileAccess.Read);

//1. Reading from a binary Excel file ('97-2003 format; *.xls)
IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
//...
//2. Reading from a OpenXml Excel file (2007 format; *.xlsx)
IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
//...
//3. DataSet - The result of each spreadsheet will be created in the result.Tables
DataSet result = excelReader.AsDataSet();
//...
//4. DataSet - Create column names from first row
excelReader.IsFirstRowAsColumnNames = true;
DataSet result = excelReader.AsDataSet();

//5. Data Reader methods
while (excelReader.Read())
{
//excelReader.GetInt32(0);
}

//6. Free resources (IExcelDataReader is IDisposable)
excelReader.Close();

Reference: How do I import from Excel to a DataSet using Microsoft.Office.Interop.Excel?

Andrea
  • 6,032
  • 2
  • 28
  • 55
Furqan Safdar
  • 16,260
  • 13
  • 59
  • 93
6

Try to use this free way to this, https://freenetexcel.codeplex.com

 Workbook workbook = new Workbook();

 workbook.LoadFromFile(@"..\..\parts.xls",ExcelVersion.Version97to2003);
 //Initialize worksheet
 Worksheet sheet = workbook.Worksheets[0];

 DataTable dataTable = sheet.ExportDataTable();
Hark.Tenl
  • 69
  • 1
  • 1
  • There are limitations, like the number of sheets and number of rows readable. You have to buy a licence to remove these limitations... Also, the codeplex link is about to be broken. Here is the official product page : https://www.e-iceblue.com/Introduce/free-xls-component.html – Dude Pascalou Sep 01 '21 at 10:36
5

If you can restrict it to just (Open Office XML format) *.xlsx files, then probably the most popular library would be EPPLus.

Bonus is, there are no other dependencies. Just install using nuget:

Install-Package EPPlus
davewasthere
  • 2,988
  • 2
  • 24
  • 25
0

Try to use Aspose.cells library (not free, but trial is enough to read), it is quite good

Install-package Aspose.cells

There is sample code:

using Aspose.Cells;
using System;

namespace ExcelReader
{
    class Program
    {
        static void Main(string[] args)
        {
            // Replace path for your file
            readXLS(@"C:\MyExcelFile.xls"); // or "*.xlsx"
            Console.ReadKey();
        }

        public static void readXLS(string PathToMyExcel)
        {
            //Open your template file.
            Workbook wb = new Workbook(PathToMyExcel);

            //Get the first worksheet.
            Worksheet worksheet = wb.Worksheets[0];

            //Get cells
            Cells cells = worksheet.Cells;

            // Get row and column count
            int rowCount = cells.MaxDataRow;
            int columnCount = cells.MaxDataColumn;

            // Current cell value
            string strCell = "";

            Console.WriteLine(String.Format("rowCount={0}, columnCount={1}", rowCount, columnCount));

            for (int row = 0; row <= rowCount; row++) // Numeration starts from 0 to MaxDataRow
            {
                for (int column = 0; column <= columnCount; column++)  // Numeration starts from 0 to MaxDataColumn
                {
                    strCell = "";
                    strCell = Convert.ToString(cells[row, column].Value);
                    if (String.IsNullOrEmpty(strCell))
                    {
                        continue;
                    }
                    else
                    {
                        // Do your staff here
                        Console.WriteLine(strCell);
                    }
                }
            }
        }
    }
}
fr0ga
  • 333
  • 4
  • 9
0

Read from excel, modify and write back

 /// <summary>
/// /Reads an excel file and converts it into dataset with each sheet as each table of the dataset
/// </summary>
/// <param name="filename"></param>
/// <param name="headers">If set to true the first row will be considered as headers</param>
/// <returns></returns>
public DataSet Import(string filename, bool headers = true)
{
    var _xl = new Excel.Application();
    var wb = _xl.Workbooks.Open(filename);
    var sheets = wb.Sheets;
    DataSet dataSet = null;
    if (sheets != null && sheets.Count != 0)
    {
        dataSet = new DataSet();
        foreach (var item in sheets)
        {
            var sheet = (Excel.Worksheet)item;
            DataTable dt = null;
            if (sheet != null)
            {
                dt = new DataTable();
                var ColumnCount = ((Excel.Range)sheet.UsedRange.Rows[1, Type.Missing]).Columns.Count;
                var rowCount = ((Excel.Range)sheet.UsedRange.Columns[1, Type.Missing]).Rows.Count;

                for (int j = 0; j < ColumnCount; j++)
                {
                    var cell = (Excel.Range)sheet.Cells[1, j + 1];
                    var column = new DataColumn(headers ? cell.Value : string.Empty);
                    dt.Columns.Add(column);
                }

                for (int i = 0; i < rowCount; i++)
                {
                    var r = dt.NewRow();
                    for (int j = 0; j < ColumnCount; j++)
                    {
                        var cell = (Excel.Range)sheet.Cells[i + 1 + (headers ? 1 : 0), j + 1];
                        r[j] = cell.Value;
                    }
                    dt.Rows.Add(r);
                }

            }
            dataSet.Tables.Add(dt);
        }
    }
    _xl.Quit();
    return dataSet;
}



 public string Export(DataTable dt, bool headers = false)
    {
        var wb = _xl.Workbooks.Add();
        var sheet = (Excel.Worksheet)wb.ActiveSheet;
        //process columns
        for (int i = 0; i < dt.Columns.Count; i++)
        {
            var col = dt.Columns[i];
            //added columns to the top of sheet
            var currentCell = (Excel.Range)sheet.Cells[1, i + 1];
            currentCell.Value = col.ToString();
            currentCell.Font.Bold = true;
            //process rows
            for (int j = 0; j < dt.Rows.Count; j++)
            {
                var row = dt.Rows[j];
                //added rows to sheet
                var cell = (Excel.Range)sheet.Cells[j + 1 + 1, i + 1];
                cell.Value = row[i];
            }
            currentCell.EntireColumn.AutoFit();
        }
        var fileName="{somepath/somefile.xlsx}";
        wb.SaveCopyAs(fileName);
        _xl.Quit();
        return fileName;
    }
Beingnin
  • 2,288
  • 1
  • 21
  • 37
0

I used Office's NuGet Package: DocumentFormat.OpenXml and pieced together the code from that component's doc site.

With the below helper code, was similar in complexity to my other CSV file format parsing in that project...

public static async Task ImportXLSX(Stream stream, string sheetName) {
{
    // This was necessary for my Blazor project, which used a BrowserFileStream object
    MemoryStream ms = new MemoryStream();
    await stream.CopyToAsync(ms);

    using (var document = SpreadsheetDocument.Open(ms, false))
    {
        // Retrieve a reference to the workbook part.
        WorkbookPart wbPart = document.WorkbookPart;

        // Find the sheet with the supplied name, and then use that 
        // Sheet object to retrieve a reference to the first worksheet.
        Sheet theSheet = wbPart?.Workbook.Descendants<Sheet>().Where(s => s?.Name == sheetName).FirstOrDefault();

        // Throw an exception if there is no sheet.
        if (theSheet == null)
        {
            throw new ArgumentException("sheetName");
        }

        WorksheetPart wsPart = (WorksheetPart)(wbPart.GetPartById(theSheet.Id));
        // For shared strings, look up the value in the
        // shared strings table.
        var stringTable = 
            wbPart.GetPartsOfType<SharedStringTablePart>()
            .FirstOrDefault();

        // I needed to grab 4 cells from each row
        // Starting at row 11, until the cell in column A is blank
        int row = 11;

        while (true) {
                var accountNameCell = GetCell(wsPart, "A" + row.ToString());
                var accountName = GetValue(accountNameCell, stringTable);
                if (string.IsNullOrEmpty(accountName)) {
                    break;
                }
                var investmentNameCell = GetCell(wsPart, "B" + row.ToString());
                var investmentName = GetValue(investmentNameCell, stringTable);
                var symbolCell = GetCell(wsPart, "D" + row.ToString());
                var symbol = GetValue(symbolCell, stringTable);
                var marketValue = GetCell(wsPart, "J" + row.ToString()).InnerText;
                
                // DO STUFF with data

                row++;
        }
    }
}

private static string? GetValue(Cell cell, SharedStringTablePart stringTable) {
    try {
        return stringTable.SharedStringTable.ElementAt(int.Parse(cell.InnerText)).InnerText;
    } catch (Exception) {
        return null;
    }
}
private static Cell GetCell(WorksheetPart wsPart, string cellReference) {
    return wsPart.Worksheet.Descendants<Cell>().Where(c => c.CellReference.Value == cellReference)?.FirstOrDefault();
}
Rob Relyea
  • 401
  • 4
  • 3