1

I'm trying to read a MS Office Excel document so I can test a group of regular expressions against it. I have something that works but it seems incredibly clumsy. Checking every single cell.... I just can't imagine this is the best way. Any suggestions to make this better.

    static void ReadMSOfficeExcelFile(string file) {
        try {
            Microsoft.Office.Interop.Excel.Application xlsApp = new Microsoft.Office.Interop.Excel.Application();
            object nullobj = System.Reflection.Missing.Value;
            object ofalse = false;
            object ofile = file;

            Microsoft.Office.Interop.Excel.Workbook xlsWorkbook = xlsApp.Workbooks.Open(
                                                             file, nullobj, nullobj,
                                                             nullobj, nullobj, nullobj,
                                                             nullobj, nullobj, nullobj,
                                                             nullobj, nullobj, nullobj,
                                                             nullobj, nullobj, nullobj);

            Microsoft.Office.Interop.Excel.Sheets xlsSheets = xlsWorkbook.Worksheets;

            foreach (Microsoft.Office.Interop.Excel.Worksheet xlsWorkSheet in xlsWorkbook.Sheets) {
                Microsoft.Office.Interop.Excel.Range xlsRange = xlsWorkSheet.UsedRange;
                foreach (Microsoft.Office.Interop.Excel.Range xlsRow in xlsRange.Rows) {
                    foreach (Microsoft.Office.Interop.Excel.Range xlsCell in xlsRow.Columns) {
                        CheckLineMatch(file, xlsCell.Text.ToString());
                    }
                }
            }
            xlsWorkbook.Close(ofalse, nullobj, nullobj);
        }
        catch {
            PrintError("Unable to parse file because of MS Office error.", file);
        }
    }
dsolimano
  • 8,870
  • 3
  • 48
  • 63
MTeck
  • 1,642
  • 3
  • 16
  • 34

3 Answers3

0

Use ClosedXML and loop through columnsets http://ishwarnataraj.blogspot.com/2015/10/excel-regular-expression-parsing.html

0

You should check out this library:

http://code.google.com/p/excellibrary/

The aim of this project is provide a native .NET solution to create, read and modify Excel files without using COM interop or OLEDB connection.

Currently .xls (BIFF8) format is implemented. In future .xlsx (Excel 2007) may also be supported.

It's nice but here isn't a whole lot of documentation so you might go through some struggles at first.

Community
  • 1
  • 1
evasilchenko
  • 1,862
  • 1
  • 13
  • 26
0

I kept looking at that other post. I noticed the post that reads:

""" How about Excel Data Reader?

http://exceldatareader.codeplex.com/

I've used in it anger, in a production environment, to pull large amounts of data from a variety of Excel files into SQL Server Compact. It works very well and it's rather robust. """

It ended up working very well for what I need.

    static void ReadMSOffice2003ExcelFile(string file) {
        FileStream stream = File.Open(file, FileMode.Open, FileAccess.Read);
        ReadMSOfficeExcelFile(file, ExcelReaderFactory.CreateBinaryReader(stream));
    }


    static void ReadMSOffice2007ExcelFile(string file) {
        FileStream stream = File.Open(file, FileMode.Open, FileAccess.Read);
        ReadMSOfficeExcelFile(file, ExcelReaderFactory.CreateOpenXmlReader(stream));
    }


    static void ReadMSOfficeExcelFile(string file, IExcelDataReader xlsReader) {
        string xlsRow;
        while (xlsReader.Read()) {
            xlsRow = "";
            for (int i = 0; i < xlsReader.FieldCount; i++) {
                xlsRow += " " + xlsReader.GetString(i);
            }
            CheckLineMatch(file, xlsRow);
        }
    }
MTeck
  • 1,642
  • 3
  • 16
  • 34