0

I am using OLEDB to parse CSV file into DataTable. It is working fine but it creates an issue when some value in CSV contains double quote("). OLEDB skips the value of the of the next remaining columns in that row.

For example, I have following values in CSV file. Here value of Col2 in second row contain the double quote(").

enter image description here

When I parse the CSV to DataTable then DataTable contains the following values. Here the value of col3 and col4 in blank in the second row.

enter image description here

I am using following connection string

"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filePath + ";Extended Properties='text;HDR=Yes;IMEX=1;ColNameHeader=True;CharacterSet=65001;FMT=Delimited(,)'"

And query is

"select * from [" + fileName + "]"

Here is the full code

string connStr = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filePath + ";Extended Properties='text;HDR=Yes;IMEX=1;ColNameHeader=True;CharacterSet=65001;FMT=Delimited(,)'";
using (OleDbConnection conn = new OleDbConnection(connStr))
{
    conn.Open();
    table = new DataTable();
    var dataAdapter = new OleDbDataAdapter("select * from [" + fileName + "]", conn);
    dataAdapter.Fill(table);
}

How can I ignore double quote(") from value?

Note: I am downloading the file from the third party directly use that this CSV for parsing.

Girish
  • 427
  • 2
  • 6
  • 19
  • it is in the visualbasic namespace, but certainly usable from c#. there is a class called TextFieldParser that makes processing CSVs easy. https://msdn.microsoft.com/en-us/library/microsoft.visualbasic.fileio.textfieldparser.aspx maybe try this instead of oledb? – Jeremy Jan 12 '18 at 14:21
  • @Jeremy - Will it work for file size > 300MB? – Girish Jan 12 '18 at 14:29
  • 1
    Its constructor is off of a standard .NET stream object, so you can use whatever streaming strategy works best in your situation. – Jeremy Jan 12 '18 at 14:43

1 Answers1

0

You going to need to create schema.ini file and place it in the same directory as the files you're pulling the data from. In it you will have to give TextDelimiter and Format to the csv file. Set the TextDelimiter to none

[YOURCSVFILENAME.csv]
ColNameHeader=True
Format=CSVDelimited
TextDelimiter=none

Hope this helps.

UPDATE: Creating schema.ini file dynamically...

  string csvFilePath = /* CSV file directory */
  string csvFileName = /* CSV file name */
  using (FileStream sr = new FileStream(csvFilePath + "\\schema.ini", 
      FileMode.Create, FileAccess.Write)) 
  { 
      using (StreamWriter writer=new StreamWriter(sr)) 
      { 
          writer.WriteLine("[" + csvFileName + "]"); 
          writer.WriteLine("ColNameHeader=True"); 
          writer.WriteLine("Format=CSVDelimited"); 
          writer.WriteLine("TextDelimiter=none"); 
          writer.Close(); 
          writer.Dispose(); 
      } 
  } 
Cinchoo
  • 6,088
  • 2
  • 19
  • 34
  • The CSV file name is dynamic. CSV file name is based on the current date. So how can I provide a dynamic file name in schema file? – Girish Jan 15 '18 at 04:21
  • And also I am reading multiple CSV files with the different format from a single folder. – Girish Jan 15 '18 at 05:53
  • In that case, you will have to create the schema.ini file dynamically before your open connection and query. See update above for sample code. – Cinchoo Jan 15 '18 at 13:14