3

Here's my situation. I'm designing a program that takes Excel files (which may be in csv, xls, or xlsx format) from a remote network drive, processes the data, then outputs and stores the results of that process. The program provides a listbox of filenames that are obtained from the remote network drive folder using the method detailed in the accepted answer here. Once the user selects a filename from the listbox, I want the program to find the file and obtain the information from it to do the data processing. I have tried using this method to read the data from the Excel file while in a threaded security context, but that method just fails without giving any kind of error. It seems to not terminate. Am I going about this the wrong way?

Edit - (Final Notes: I have taken out the OleDbDataAdapter and replaced it with EPPlus handling.)

I was able to scrub sensitive data from the code, so here it is:

protected void GetFile(object principalObj)
    {
        if (principalObj == null)
        {
            throw new ArgumentNullException("principalObj");
        }

        IPrincipal principal = (IPrincipal)principalObj;
        Thread.CurrentPrincipal = principal;
        WindowsIdentity identity = principal.Identity as WindowsIdentity;
        WindowsImpersonationContext impersonationContext = null;
        if (identity != null)
        {
            impersonationContext = identity.Impersonate();
        }
        try
        {
            string fileName = string.Format("{0}\\" + Files.SelectedValue, @"RemoteDirectoryHere");
            string connectionString = string.Format("Provider=Microsoft.ACE.OLEDB.14.0; data source={0}; Extended Properties=Excel 14.0;", fileName);

            OleDbDataAdapter adapter = new OleDbDataAdapter("SELECT * FROM Sheet1", connectionString);
            DataSet ds = new DataSet();

            adapter.Fill(ds, "Sheet1");

            dataTable = ds.Tables["Sheet1"];
        }
        finally
        {
            if (impersonationContext != null)
            {
                impersonationContext.Undo();
            }
        }
    }

Additional Edit

Now xlsx files have been added to the mix.

Third Party

Third party solutions are not acceptable in this case (unless they allow unrestricted commercial use).

Attempts - (Final Notes: Ultimately I had to abandon OleDb connections.)

I have tried all of the different connection strings offered, and I have tried them with just one file type at a time. None of the connection strings worked with any of the file types.

Permissions

The User does have access to the file and its directory.

Community
  • 1
  • 1
called2voyage
  • 252
  • 9
  • 28
  • 1
    Check out the Related questions list on the right of this page. The top question there has a lot of upvotes. – DOK Jul 02 '13 at 17:37
  • @DOK These don't appear to be answers to my questions. – called2voyage Jul 02 '13 at 17:39
  • 1
    Correction. CSV is not an Excel file. This is a text file – T.S. Jul 02 '13 at 17:48
  • @T.S. This is a good point, but the data is being entered into the CSV files using Excel. – called2voyage Jul 02 '13 at 17:52
  • 1
    Make sure you use right provider to read Excel files. If you use engine from office 12 on office 14, it will fail to read – T.S. Jul 02 '13 at 17:54
  • The code given in the sample method (to read excel files) depends on a driver which is not 64 bit compatible. Are you using the driver identified in the comments or changing your target to x86? Also, there's a different connection string for xlsx. Have you taken that into consideration? At the end of the day, without seeing any code, it's hard to identify your issue. – rie819 Jul 02 '13 at 17:58
  • My code doesn't vary from what is linked except by specifics which cannot be posted here. We *are* working with 64-bit machines, so that could be an issue. Also, we are not using xlsx, only xls and csv. – called2voyage Jul 02 '13 at 18:00
  • Make sure you use right provider to read Excel files. If you use engine from office 12 on office 14, it will fail to read. Again, Excel is might used as csv editor but it is not even the best at that. For example, csv file from France will be semicolon delimited and you will have issues opening it in the USA. You will have a single row in one cell. You will need to use advanced features to open it properly. Therefore, it might be even better to programmatically read csv files differently from xls. – T.S. Jul 02 '13 at 18:00
  • Besides, are you trying to get information about file or from within the file? I worked a lot with importing data from Excel. so, I am trying to understand what is the story here? – T.S. Jul 02 '13 at 18:02
  • @T.S. The purpose of this question is not to discuss the suitability of editing csv files with Excel. That is beyond the scope of my program. These files are being created and edited by other engineers. Also, non-US files will not be an issue in this case. Programmatically reading csv files differently *may* be an option, but it will be complicated by the fact that sometimes csv files are being saved into xls format and vice-versa. – called2voyage Jul 02 '13 at 18:03
  • @T.S. I am trying to get information from within the file. – called2voyage Jul 02 '13 at 18:04
  • Do you use Microsoft.ACE.OLEDB.14.0; and Microsoft.ACE.OLEDB.12.0; providers? – T.S. Jul 02 '13 at 18:11
  • @T.S. Yes, we use ACE. – called2voyage Jul 02 '13 at 18:14
  • 1
    Make sure you use right provider, whic hcorresponds to the version of Excel installed – T.S. Jul 02 '13 at 18:34
  • I just found [this](http://stackoverflow.com/questions/5500876/the-microsoft-ace-oledb-14-0-provider-is-not-registered-on-the-local-machine) which could be related since I am also using Office 2010 64-bit on Windows 7. – called2voyage Jul 02 '13 at 18:40
  • I have to admit that we've converted all our system to run in x86 because some of the components were not provided for x64. So, I had x86 code using x86 Excel on x64 machine and everything was fine. – T.S. Jul 02 '13 at 18:50
  • @T.S. I wish that were an option in this case. – called2voyage Jul 02 '13 at 18:55
  • I give you one more idea - in code create ODBC DSN for your file (write registry entries) and use ADO.net to connect to it. You can test it by manually doing it (via computer interface) and try to read your file. Remember, there are 2 ODBC managers on x64 machines. Try both. – T.S. Jul 02 '13 at 18:56
  • I do not have the permissions to do this to the input files. – called2voyage Jul 02 '13 at 18:59
  • This is puzzling answer – T.S. Jul 02 '13 at 19:00
  • @T.S. I was able to scrub sensitive data from my code and I have added it to the original question. – called2voyage Jul 03 '13 at 18:09
  • Try extended properties in qoutes, e.g. Extended Properties="Excel 8.0;HDR=YES" – T.S. Jul 03 '13 at 20:17
  • @T.S. That did not work. – called2voyage Jul 03 '13 at 20:31
  • Does the IIS pool user-ID have permissions to read the file and its directory? – David R Tribble Jul 11 '13 at 17:17
  • @DavidRTribble I'm using the end user's user-ID using the method in my first link (How do I start a thread in a different security context?). The end user user-ID does have permissions to read the file and its directory. – called2voyage Jul 11 '13 at 17:45

4 Answers4

3

Your connection string might be the issue here. As far as I know, there isn't 1 that can read all xls, csv, and xlsx. I think you're using the XLSX connection string.

When I read xls, i use the following connection string:

@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + sFilePath + ";Extended Properties='Excel 8.0;HDR=YES;IMEX=1;'"

Having said that, I recommend using a 3rd party file reader/parser to read XLS and CSV since, from my experience, OleDbDataAdapter is wonky depending on the types of data that's being read (and how mixed they are within each column).

For XLS, try NPOI https://code.google.com/p/npoi/

For CSV, try http://www.codeproject.com/Articles/9258/A-Fast-CSV-Reader

For XLSX, try EPPlus http://epplus.codeplex.com/

I've had great success with the above libraries.

banging
  • 2,540
  • 1
  • 22
  • 26
  • Now xlsx files are in the mix too, and none of these connection strings are working for csv, xls, or xlsx. I have tried them on each individually. A 3rd party solution is not acceptable in this case. Thanks for the links though. – called2voyage Jul 03 '13 at 18:24
  • 1
    Reading the files natively (as the 3rd party solutions are doing it) is the only way to read mixed data properly. Even if you do get it to read using oledb, you'll soon realize it's not reliable in certain situation. – banging Jul 03 '13 at 18:41
  • Even if I treat the extensions separately where possible? I find that hard to believe. Is there any way to read the files natively without 3rd party solutions? – called2voyage Jul 03 '13 at 18:46
  • 1
    @called2voyage They are just files, so if you can request and get them in memory you can parse them yourself but that will be a huge amount of code to write. They are each formatted differently and thus you'll need to write a different parser for each. That code is already written in open source projects. EPPlus is great for xlsx in my opinion and just requires you include a DLL in your project. No purchasing a license, no COM interop, no need for Excel being installed on the server. Not sure what you call 3rd party but permissively licensed .NET DLL is no different than your own code. – AaronLS Jul 11 '13 at 18:00
  • @AaronLS Does the open source license allow commercial use? – called2voyage Jul 11 '13 at 18:02
  • Yes, it is LGPL, so you can add the DLL to your commercial project. – AaronLS Jul 11 '13 at 18:06
  • @AaronLS I got it to work with EPPlus. If you post it as an answer, I'll award you the bounty. – called2voyage Jul 11 '13 at 21:47
  • @banging If you modify your answer to focus on the third-party solutions and feature EPPlus in particular (highlighting that is an LGPL-licensed project allowing commercial use), and remove the part saying that the issue was with the connection string (as I have confirmed that is not the case), then I can accept your answer. – called2voyage Jul 15 '13 at 13:49
  • @called2voyage There you go. The only reason i didn't suggest EPP to begin with is because XLSX was not in scope. – banging Jul 15 '13 at 13:54
1

Is it really important that you use an OleDb interface for this? I've always done it with Microsoft.Office.Excel.Interop, to wit:

using System;
using Microsoft.Office.Interop.Excel;

namespace StackOverflowExample
{
    class Program
    {
        static void Main(string[] args)
        {
            var app = new Application();
            var wkbk = app.Workbooks.Open(@"c:\data\foo.xls") as Workbook;
            var wksht = wkbk.Sheets[1] as Worksheet; // not zero-based!
            for (int row = 1; row <= 100; row++) // not zero-based!
            {
                Console.WriteLine("This is row #" + row.ToString());
                for (int col = 1; col <= 100; col++)
                {
                    Console.WriteLine("This is col #" + col.ToString());
                    var cell = wksht.Cells[row][col] as Range;
                    if (cell != null)
                    {
                        object val = cell.Value;
                        if (val != null)
                        {
                            Console.WriteLine("The value of the cell is " + val.ToString());
                        }
                    }
                }
            }
        }
    }
}
catfood
  • 4,267
  • 5
  • 29
  • 55
0

As you will be dealing with xlsx extension, you should rather opt for the new connection string.

public static string getConnectionString(string fileName, bool HDRValue, bool WriteExcel)
{
    string hdrValue = HDRValue ? "YES" : "NO";
    string writeExcel = WriteExcel ? string.Empty : "IMEX=1";
    return "Provider=Microsoft.ACE.OLEDB.12.0;" + "Data Source=" + fileName + ";" + "Extended Properties=\"Excel 12.0 xml;HDR=" + hdrValue + ";" + writeExcel + "\"";
}

Above is the code for getting the connection string. First argument expects the actual path for file location. Second argument will decide whether to consider first row values as column headers or not. Third argument helps decide whether you want to open the connection to create and write the data or simply read the data. To read the data set it to "FALSE"

public static ReadData(string filePath, string sheetName, List<string> fieldsToRead, int startPoint, int endPoint)
{
    DataTable dt = new DataTable();
    try
    {
        string ConnectionString = ProcessFile.getConnectionString(filePath, false, false);
        using (OleDbConnection cn = new OleDbConnection(ConnectionString))
        {
            cn.Open();
            DataTable dbSchema = cn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
            if (dbSchema == null || dbSchema.Rows.Count < 1)
            {
                throw new Exception("Error: Could not determine the name of the first worksheet.");
            }
            StringBuilder sb = new StringBuilder();
            sb.Append("SELECT *");
            sb.Append(" FROM [" + sheetName + fieldsToRead[0].ToUpper() + startPoint + ":" + fieldsToRead[1].ToUpper() + endPoint + "] ");
            OleDbDataAdapter da = new OleDbDataAdapter(sb.ToString(), cn);
            dt = new DataTable(sheetName);
            da.Fill(dt);
            if (dt.Rows.Count > 0)
            {
                foreach (DataRow row in dt.Rows)
                {
                string i = row[0].ToString();
                }
            }
            cn.Dispose();
            return fileDatas;
        }
    }
    catch (Exception)
    {
    }
}
Nisarg Shah
  • 354
  • 1
  • 14
0

This is for reading 2007 Excel into dataset

  DataSet ds = new DataSet();
        try
        {


            string myConnStr = "";

                myConnStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=MyDataSource;Extended Properties=\"Excel 12.0;HDR=YES\"";


            OleDbConnection myConn = new OleDbConnection(myConnStr);
            OleDbCommand cmd = new OleDbCommand("select * from [Sheet1$] ", myConn);
            OleDbDataAdapter adapter = new OleDbDataAdapter();
            adapter.SelectCommand = cmd;
            myConn.Open();
            adapter.Fill(ds);
            myConn.Close();
        }
        catch
        { }
        return ds;
Rohit
  • 10,056
  • 7
  • 50
  • 82