0

How can I use OLEDB to parse and import a CSV file that each cell is encased in double quotes because some rows contain commas in them?? I am unable to change the format as it is coming from a vendor.

I am trying the following and it is failing with an IO error:

public DataTable ConvertToDataTable(string fileToImport, string fileDestination)
{
    string fullImportPath = fileDestination + @"\" + fileToImport;
    OleDbDataAdapter dAdapter = null;
    DataTable dTable = null;

    try
    {
        if (!File.Exists(fullImportPath))
            return null;

        string full = Path.GetFullPath(fullImportPath);
        string file = Path.GetFileName(full);
        string dir = Path.GetDirectoryName(full);


        //create the "database" connection string
        string connString = "Provider=Microsoft.Jet.OLEDB.4.0;"
          + "Data Source=\"" + dir + "\\\";"
          + "Extended Properties=\"text;HDR=No;FMT=Delimited\"";

        //create the database query
        string query = "SELECT * FROM " + file;

        //create a DataTable to hold the query results
        dTable = new DataTable();

        //create an OleDbDataAdapter to execute the query
        dAdapter = new OleDbDataAdapter(query, connString);


        //fill the DataTable
        dAdapter.Fill(dTable);
    }
    catch (Exception ex)
    {
        throw new Exception(CLASS_NAME + ".ConvertToDataTable: Caught Exception: " + ex);
    }
    finally
    {
        if (dAdapter != null)
            dAdapter.Dispose();
    }

    return dTable;
}

When I use a normal CSV it works fine. Do I need to change something in the connString??

John Saunders
  • 160,644
  • 26
  • 247
  • 397
gcoleman0828
  • 1,541
  • 3
  • 30
  • 49

8 Answers8

3

Use a dedicated CSV parser.

There are many out there. A popular one is FileHelpers, though there is one hidden in the Microsoft.VisualBasic.FileIO namespace - TextFieldParser.

Oded
  • 489,969
  • 99
  • 883
  • 1,009
1

You can use this code : MS office required

  private void ConvertCSVtoExcel(string filePath = @"E:\nucc_taxonomy_140.csv", string tableName = "TempTaxonomyCodes")
    {
        string tempPath = System.IO.Path.GetDirectoryName(filePath);
        string strConn = @"Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" + tempPath + @"\;Extensions=asc,csv,tab,txt";
        OdbcConnection conn = new OdbcConnection(strConn);
        OdbcDataAdapter da = new OdbcDataAdapter("Select * from " + System.IO.Path.GetFileName(filePath), conn);
        DataTable dt = new DataTable();
        da.Fill(dt);

        using (SqlBulkCopy bulkCopy = new SqlBulkCopy(ConfigurationSettings.AppSettings["dbConnectionString"]))
        {
            bulkCopy.DestinationTableName = tableName;
            bulkCopy.BatchSize = 50;
            bulkCopy.WriteToServer(dt);
        }

    }
Vishal Sen
  • 1,135
  • 1
  • 13
  • 23
1

There is a lot to consider when handling CSV files. However you extract them from the file, you should know how you are handling the parsing. There are classes out there that can get you part way, but most don't handle the nuances that Excel does with embedded commas, quotes and line breaks. However, loading Excel or the MS classes seems a lot of freaking overhead if you just want parse a txt file like a CSV.

One thing you can consider is doing the parsing in your own Regex, which will also make your code a little more platform independent, in case you need to port it to another server or application at some point. Using regex has the benefit of also being accessible in virtually every language. That said, there are some good regex patterns out there that handle the CSV puzzle. Here is my shot at it, which does cover embedded commas, quotes and line breaks. Regex code/pattern and explanation :

http://www.kimgentes.com/worshiptech-web-tools-page/2008/10/14/regex-pattern-for-parsing-csv-files-with-embedded-commas-dou.html

Hope that is of some help..

Kim Gentes
  • 1,496
  • 1
  • 18
  • 38
1

Have a look at FileHelpers.

Robert Harvey
  • 178,213
  • 47
  • 333
  • 501
0
 private static void Mubashir_CSVParser(string s)
        {
            // extract the fields
            Regex RegexCSVParser = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
            String[] Fields = RegexCSVParser.Split(s);

            // clean up the fields (remove " and leading spaces)
            for (int i = 0; i < Fields.Length; i++)
            {
                Fields[i] = Fields[i].TrimStart(' ', '"');
                Fields[i] = Fields[i].TrimEnd('"');// this line remove the quotes
                //Fields[i] = Fields[i].Trim();
            }
        }
Muhammad Mubashir
  • 1,591
  • 1
  • 21
  • 18
0

Try the code from my answer here:

Reading CSV files in C#

It handles quoted csv just fine.

Community
  • 1
  • 1
Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794
  • I know YOU know about the close as duplicate feature! – Ben Voigt Oct 21 '11 at 20:05
  • 2
    @Ben I take a very narrow view about what to consider an **exact** duplicate. He may have other restrictions on him that require him to use the OleDb parser, and then the question is narrowed to problem with quote-enclosed text. I don't recall seeing that specific question before. – Joel Coehoorn Oct 21 '11 at 20:09
  • @user take a look Microsoft.VisualBasic.TextFieldParser. It's built in to .Net. – Joel Coehoorn Oct 21 '11 at 21:02
  • So I have been trying to get TextFieldParser to work and I am stuck where it is still reading it as one long line. I have it HasFieldsEncosedinQuotes = true and the delimiter is ",". Any ideas? – gcoleman0828 Oct 24 '11 at 13:05
-1

This is what I used in a project, parses a single line of data.

    private string[] csvParser(string csv, char separator = ',')
    {
        List <string> parsed = new List<string>();
        string[] temp = csv.Split(separator);
        int counter = 0;
        string data = string.Empty;
        while (counter < temp.Length)
        {
            data = temp[counter].Trim();
            if (data.Trim().StartsWith("\""))
            {
                bool isLast = false;
                while (!isLast && counter < temp.Length)
                {
                    data += separator.ToString() + temp[counter + 1];
                    counter++;
                    isLast = (temp[counter].Trim().EndsWith("\""));
                }
            }
            parsed.Add(data);
            counter++;
        }

        return parsed.ToArray();

    }

http://zamirsblog.blogspot.com/2013/09/c-csv-parser-csvparser.html

Community
  • 1
  • 1
Zamir
  • 79
  • 4
-1

Just incase anyone has a similar issue, i wanted to post the code i used. i did end up using Textparser to get the file and parse ot the columns, but i am using recrusion to get the rest done and substrings.

 /// <summary>
        /// Parses each string passed as a "row".
        /// This routine accounts for both double quotes
        /// as well as commas currently, but can be added to
        /// </summary>
        /// <param name="row"> string or row to be parsed</param>
        /// <returns></returns>
        private List<String> ParseRowToList(String row)
        {
            List<String> returnValue = new List<String>();

            if (row[0] == '\"')
            {// Quoted String
                if (row.IndexOf("\",") > -1)
                {// There are more columns
                    returnValue = ParseRowToList(row.Substring(row.IndexOf("\",") + 2));
                    returnValue.Insert(0, row.Substring(1, row.IndexOf("\",") - 1));
                }
                else
                {// This is the last column
                    returnValue.Add(row.Substring(1, row.Length - 2));
                }
            }
            else
            {// Unquoted String
                if (row.IndexOf(",") > -1)
                {// There are more columns
                    returnValue = ParseRowToList(row.Substring(row.IndexOf(",") + 1));
                    returnValue.Insert(0, row.Substring(0, row.IndexOf(",")));
                }
                else
                {// This is the last column
                    returnValue.Add(row.Substring(0, row.Length));
                }
            }

            return returnValue;

        }

Then the code for Textparser is:

 // string pathFile = @"C:\TestFTP\TestCatalog.txt";
            string pathFile = @"C:\TestFTP\SomeFile.csv";

            List<String> stringList = new List<String>();
            TextFieldParser fieldParser = null;
            DataTable dtable = new DataTable();

            /* Set up TextFieldParser
                *  use the correct delimiter provided
                *  and path */
            fieldParser = new TextFieldParser(pathFile);
            /* Set that there are quotes in the file for fields and or column names */
            fieldParser.HasFieldsEnclosedInQuotes = true;

            /* delimiter by default to be used first */
            fieldParser.SetDelimiters(new string[] { "," });

            // Build Full table to be imported
            dtable = BuildDataTable(fieldParser, dtable);
gcoleman0828
  • 1,541
  • 3
  • 30
  • 49
  • 1
    It would be nice if people would explain why they down voted answer.. it answers my own question, not sure how that can get down voted – gcoleman0828 Jan 13 '15 at 01:49