2

I need to fill two datatables with two excel files. The files can be a Comma-separated values file (.csv) or it can be an Excel document (.xlsx) file. I use a function to convert my .csv to a datatable :


public DataTable GetDataTableFromCsv(string path)
{
    DataTable dataTable = new DataTable();
    String[] values;

    values = File.ReadAllLines(path);

    string[] csvRows = System.IO.File.ReadAllLines(path);
    string[] headers = csvRows[0].Split(',');

    // Adding columns name
    foreach (var item in headers)
        dataTable.Columns.Add(new DataColumn(item));

    string[] fields = null;

    foreach (string csvRow in csvRows)
    {
        fields = csvRow.Split(',');
        DataRow row = dataTable.NewRow();
        row.ItemArray = fields;
        dataTable.Rows.Add(row);
    }
    return dataTable;
}

The problem occurs when I try to compare my two datatables with :

ieDiff = dt1.AsEnumerable().Except(dt2.AsEnumerable(), DataRowComparer.Default);

My first datatable obtained by a .xlsx file contains the correct values ​​and my second datatable obtained by a .csv file contains the correct values​​, but with some extra quotes.

ex :

dt1.Rows[10] (datatable from .csv) will contains :

  1. Item1
  2. Item2
  3. "."
  4. "Item4"

dt2.Rows[10] (datatable from .xlsx) will contains :

  1. Item1
  2. Item2
  3. .
  4. Item4

I know I can trim all of these, but there is certainly a better way to perform this?

Gabriel
  • 27
  • 5
MHeads
  • 267
  • 1
  • 7
  • 18
  • If you're in C#, why not deal with the Excel files directly and skip the .csv stage? CSV is a horribly stupid format on even the best of days. – Marc B Apr 04 '13 at 15:22
  • I know this, but I can't do that. I need to work with this unfortunately. – MHeads Apr 04 '13 at 15:23
  • Dou you need the quotes? If not, then simply add this line in your foreach loop fields = fields.Replace("\"",""); – Mark Kram Apr 04 '13 at 15:30
  • I don't need those quotes because it gives me all datarows as an edited/added rows when I try to compare my csv datatable with my xlsx datatable. – MHeads Apr 04 '13 at 15:35

2 Answers2

1

The easiest answer is one you mentioned in the question which is to trim the quote character.

My test code:

var test = new[] { "Item1", "Item2", "\".\"", "\"Item4\"" };
foreach (var s in test)
{
    textBox1.AppendTextAddNewLine(s.Trim('"'));
}

Resulting Output:

Item1
Item2
.
Item4
Charles380
  • 1,269
  • 8
  • 19
  • Note that you may also have to un-escape quotes that are part of a field value. See this question on how to escape CSV files meant for Excel; http://stackoverflow.com/questions/165042/stop-excel-from-automatically-converting-certain-text-values-to-dates, and this page for the official CSV specification on the matter; http://tools.ietf.org/html/rfc4180#page-3 – Sphinxxx Apr 04 '13 at 19:06
  • @Sphinxxx true, but the trim will only trim the quote characters at the beginning or end not in the middle of the string, so in this case I believe it should be ok, but I'm not a quote escaping/un-escaping expert by any means – Charles380 Apr 05 '13 at 14:31
0

I have written five methods below that will turn a Csv file into a DataTable.

They have been designed to take into account optional quote marks (e.g. " symbols) and to be as versatile as possible without using other libraries:

    public static DataTable GetDataTabletFromCSVFile(string filePath, bool isHeadings)
    {
        DataTable MethodResult = null;
        try
        {
            using (TextFieldParser TextFieldParser = new TextFieldParser(filePath))
            {
                if (isHeadings)
                {
                    MethodResult = GetDataTableFromTextFieldParser(TextFieldParser);

                }
                else
                {
                    MethodResult = GetDataTableFromTextFieldParserNoHeadings(TextFieldParser);

                }

            }

        }
        catch (Exception ex)
        {
            ex.HandleException();
        }
        return MethodResult;
    }

    public static DataTable GetDataTableFromCsvString(string csvBody, bool isHeadings)
    {
        DataTable MethodResult = null;
        try
        {
            MemoryStream MemoryStream = new MemoryStream();


            StreamWriter StreamWriter = new StreamWriter(MemoryStream);

            StreamWriter.Write(csvBody);

            StreamWriter.Flush();


            MemoryStream.Position = 0;


            using (TextFieldParser TextFieldParser = new TextFieldParser(MemoryStream))
            {
                if (isHeadings)
                {
                    MethodResult = GetDataTableFromTextFieldParser(TextFieldParser);

                }
                else
                {
                    MethodResult = GetDataTableFromTextFieldParserNoHeadings(TextFieldParser);

                }

            }

        }
        catch (Exception ex)
        {
            ex.HandleException();
        }
        return MethodResult;
    }

    public static DataTable GetDataTableFromRemoteCsv(string url, bool isHeadings)
    {
        DataTable MethodResult = null;
        try
        {
            HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(url);
            HttpWebResponse httpWebResponse = (HttpWebResponse)httpWebRequest.GetResponse();

            StreamReader StreamReader = new StreamReader(httpWebResponse.GetResponseStream());

            using (TextFieldParser TextFieldParser = new TextFieldParser(StreamReader))
            {
                if (isHeadings)
                {
                    MethodResult = GetDataTableFromTextFieldParser(TextFieldParser);

                }
                else
                {
                    MethodResult = GetDataTableFromTextFieldParserNoHeadings(TextFieldParser);

                }

            }

        }
        catch (Exception ex)
        {
            ex.HandleException();
        }
        return MethodResult;
    }


    private static DataTable GetDataTableFromTextFieldParser(TextFieldParser textFieldParser)
    {
        DataTable MethodResult = null;
        try
        {
            textFieldParser.SetDelimiters(new string[] { "," });

            textFieldParser.HasFieldsEnclosedInQuotes = true;


            string[] ColumnFields = textFieldParser.ReadFields();

            DataTable dt = new DataTable();

            foreach (string ColumnField in ColumnFields)
            {
                DataColumn DataColumn = new DataColumn(ColumnField);

                DataColumn.AllowDBNull = true;

                dt.Columns.Add(DataColumn);

            }


            while (!textFieldParser.EndOfData)
            {
                string[] Fields = textFieldParser.ReadFields();


                for (int i = 0; i < Fields.Length; i++)
                {
                    if (Fields[i] == "")
                    {
                        Fields[i] = null;

                    }

                }

                dt.Rows.Add(Fields);

            }

            MethodResult = dt;

        }
        catch (Exception ex)
        {
            ex.HandleException();
        }
        return MethodResult;
    }

    private static DataTable GetDataTableFromTextFieldParserNoHeadings(TextFieldParser textFieldParser)
    {
        DataTable MethodResult = null;
        try
        {
            textFieldParser.SetDelimiters(new string[] { "," });

            textFieldParser.HasFieldsEnclosedInQuotes = true;

            bool FirstPass = true;

            DataTable dt = new DataTable();

            while (!textFieldParser.EndOfData)
            {
                string[] Fields = textFieldParser.ReadFields();

                if(FirstPass)
                {
                    for (int i = 0; i < Fields.Length; i++)
                    {
                        DataColumn DataColumn = new DataColumn("Column " + i);

                        DataColumn.AllowDBNull = true;

                        dt.Columns.Add(DataColumn);

                    }

                    FirstPass = false;

                }

                for (int i = 0; i < Fields.Length; i++)
                {
                    if (Fields[i] == "")
                    {
                        Fields[i] = null;

                    }

                }

                dt.Rows.Add(Fields);

            }

            MethodResult = dt;

        }
        catch (Exception ex)
        {
            ex.HandleException();
        }
        return MethodResult;
    }

If, like me, you're saving from reporting services then you should use it like this:

    Warning[] warnings;
    string[] streamids;
    string mimeType;
    string encoding;
    string filenameExtension;

    byte[] bytes = rvMain.ServerReport.Render("csv", null, out mimeType, out encoding, out filenameExtension, out streamids, out warnings);

    string CsvBody = System.Text.Encoding.UTF8.GetString(bytes);

    DataTable dt = GetDataTableFromCsvString(CsvBody,true);

Otherwise, all you need do is:

    bool IsHeadings = true; //Does the data include a heading row?

    DataTable dt = GetDataTableFromCsvString(CsvBody, IsHeadings);

Or to use directly from a csv file

    bool IsHeadings = true; //Does the data include a heading row?

    DataTable dt = GetDataTabletFromCsvFile(FilePath, IsHeadings)

Or to use a csv file that is stored remotely

    bool IsHeadings = true; //Does the data include a heading row?

    DataTable dt = GetDataTabletFromRemoteCsv(Url, IsHeadings)
WonderWorker
  • 8,539
  • 4
  • 63
  • 74
  • Note for Googlers: This uses the `TextFieldParser` class from `Microsoft.VisualBasic.FileIO`, which is not currently available in .NET Core. – Hoppeduppeanut Apr 16 '19 at 04:56