5

I want to read a CSV file into a DataGridView. I would like to have a class with a function that reads the CSV, like this one:

class Import
{
    public DataTable readCSV(string filePath)
    {            
        DataTable dt = new DataTable();
        using (StreamReader sr = new StreamReader(filePath))
        {
            string strLine = sr.ReadLine();     

            string[] strArray = strLine.Split(';');

            foreach (string value in strArray)
            {
                dt.Columns.Add(value.Trim());
            } 
            DataRow dr = dt.NewRow();

            while (sr.Peek() >= 0)
            {
                strLine = sr.ReadLine();
                strArray = strLine.Split(';');
                dt.Rows.Add(strArray);
            }
        }
        return dt;
     }   
}

and call it:

Import imp = new Import();

DataTable table = imp.readCSV(filePath);
foreach(DataRow row in table.Rows)
{
 dataGridView.Rows.Add(row);
}

The result of this is: rows are created, but there is no data in the cells!

Destic

4 Answers

9

First solution, using a little bit of LINQ

public DataTable readCSV(string filePath)
{
    var dt = new DataTable();
    // Creating the columns
    File.ReadLines(filePath).Take(1)
        .SelectMany(x => x.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
        .ToList()
        .ForEach(x => dt.Columns.Add(x.Trim()));

    // Adding the rows
    File.ReadLines(filePath).Skip(1)
        .Select(x => x.Split(';'))
        .ToList()
        .ForEach(line => dt.Rows.Add(line));
    return dt;
}
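
Note that the original calling code is what produces the empty cells: passing a DataRow to dataGridView.Rows.Add(row) does not copy its values into the grid. A minimal sketch of displaying the returned table instead (assuming a WinForms DataGridView named dataGridView):

Import imp = new Import();
DataTable table = imp.readCSV(filePath);

// Bind the whole table; the grid builds its columns and rows from it
dataGridView.DataSource = table;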

Below is another version using foreach loops

public DataTable readCSV(string filePath)
{
    var dt = new DataTable();
    // Creating the columns
    foreach(var headerLine in File.ReadLines(filePath).Take(1))
    {
        foreach(var headerItem in headerLine.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
        {
            dt.Columns.Add(headerItem.Trim());
        }
    }

    // Adding the rows
    foreach(var line in File.ReadLines(filePath).Skip(1))
    {
        dt.Rows.Add(line.Split(';'));
    }
    return dt;
}

First we use File.ReadLines, which returns an IEnumerable that is a collection of lines. We use Take(1) to get just the first line, which should be the header, and then we use SelectMany, which flattens the string array returned by the Split method into a single sequence; we then call ToList so we can use the ForEach method to add the columns to the DataTable.

To add the rows, we still use File.ReadLines, but now we Skip(1), which skips the header line. We then use Select to create a collection of string arrays (one per line), call ToList again, and finally call ForEach to add each row to the DataTable. File.ReadLines is available as of .NET 4.0.

Note: File.ReadLines doesn't read all lines up front; it returns an IEnumerable and the lines are lazily evaluated, so only the first line ends up being read twice.
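
If you would rather open the file only once, here is a minimal single-pass sketch of the same idea (assuming the same ';' delimiter and a header line):

public DataTable readCSVSinglePass(string filePath)
{
    var dt = new DataTable();
    // File.ReadLines is lazy, so enumerating it manually lets us consume
    // the header and the data rows in a single pass over the file
    using (var lines = File.ReadLines(filePath).GetEnumerator())
    {
        // First line: header -> columns
        if (lines.MoveNext())
        {
            foreach (var header in lines.Current.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
                dt.Columns.Add(header.Trim());
        }

        // Remaining lines: data -> rows
        while (lines.MoveNext())
            dt.Rows.Add(lines.Current.Split(';'));
    }
    return dt;
}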

See the MSDN remarks

The ReadLines and ReadAllLines methods differ as follows: When you use ReadLines, you can start enumerating the collection of strings before the whole collection is returned; when you use ReadAllLines, you must wait for the whole array of strings be returned before you can access the array. Therefore, when you are working with very large files, ReadLines can be more efficient.

You can use the ReadLines method to do the following:

Perform LINQ to Objects queries on a file to obtain a filtered set of its lines.

Write the returned collection of lines to a file with the File.WriteAllLines(String, IEnumerable<String>) method, or append them to an existing file with the File.AppendAllLines(String, IEnumerable<String>) method.

Create an immediately populated instance of a collection that takes an IEnumerable<T> collection of strings for its constructor, such as an IList<T> or a Queue<T>.

This method uses UTF8 for the encoding value.

If you still have any doubts, have a look at this answer: What is the difference between File.ReadLines() and File.ReadAllLines()?
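
As a concrete example of the first use case mentioned in the remarks above, here is a tiny sketch (hypothetical file names) that filters a file with a LINQ query and writes the result back out:

// Keep only non-empty lines from input.csv and write them to filtered.csv
var nonEmptyLines = File.ReadLines("input.csv")
                        .Where(line => !string.IsNullOrWhiteSpace(line));
File.WriteAllLines("filtered.csv", nonEmptyLines);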

Second solution, using the CsvHelper package

First, install this NuGet package:

PM> Install-Package CsvHelper 

For a given CSV, we should create a class to represent it

CSV File

Name;Age;Birthdate;Working
Alberto Monteiro;25;01/01/1990;true
Other Person;5;01/01/2010;false

The class model is

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
    public DateTime Birthdate { get; set; }
    public bool Working { get; set; }
}

Now let's use CsvReader to build the DataTable

public DataTable readCSV(string filePath)
{
    var dt = new DataTable();

    var csv = new CsvReader(new StreamReader(filePath));
    // Creating the columns
    typeof(Person).GetProperties().Select(p => p.Name).ToList().ForEach(x => dt.Columns.Add(x));

    // Adding the rows
    csv.GetRecords<Person>().ToList().ForEach(line => dt.Rows.Add(line.Name, line.Age, line.Birthdate, line.Working));
    return dt;
}

To create the columns in the DataTable we use a bit of reflection, and then we use the GetRecords method to add the rows to the DataTable.
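
If you also want the DataTable columns to be typed rather than all strings, a small variation of the reflection step could look like this (a sketch, assuming the same Person class as above):

// Create typed columns from the Person properties
foreach (var prop in typeof(Person).GetProperties())
    dt.Columns.Add(prop.Name, prop.PropertyType);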

Alberto Monteiro
  • I would have thought you need the empty entries. Or am i reading it wrong? – Hugh Jones Jan 05 '16 at 13:32
  • 1
    @HughJones The following string `abc;abc;;abc` if you split by `;` you will have 4 items => `abc | abc | | abc`, if you dont want there empty entries use **StringSplitOptions.RemoveEmptyEntries**. So in this case, I will change the answer, in rows I want the empty entries, but in header, I dont. – Alberto Monteiro Jan 05 '16 at 14:11
  • in both, I would have thought. If the number of fields in the header mismatches the number of fields in a row then that would be undesirable. Nice solution though +1 – Hugh Jones Jan 05 '16 at 15:30
  • Won't the two calls to `File.ReadLines` read the entire file twice? What's the performance penalty of using Reflection? – ardila Jan 05 '16 at 16:46
  • @aardila the File.ReadLines doesn't read all lines, it returns a IEnumerable, and lines are lazy evaluated, so just the first line will be loaded two times. Will changed my answer to explain this. – Alberto Monteiro Jan 05 '16 at 17:17
  • @HughJones see my comment about File.ReadLines – Alberto Monteiro Jan 05 '16 at 17:17
  • @AlbertoMonteiro Nice solution thanks a lot for this one! Only bug i have with the first solution->reads empty lines at the end of a file, and it doesn't trim the row entries.. – Destic Jan 06 '16 at 08:42
  • @AlbertoMonteiro just checked using Process Monitor and indeed it doesn't read the entire file twice as I had suspected. It opens the file, reads the first chunk, closes it, then the second call opens it again and reads the entire file. Should be noted that the `ReadLines` method is available as of .NET 4, though. – ardila Jan 06 '16 at 08:53
  • @Destic - any progress? – Alberto Monteiro Jan 06 '16 at 10:21
  • @AlbertoMonteiro Yes. Finally i used the method with TextFieldParser because it works well and its code is easy to understand. But thanks for your effort! – Destic Jan 06 '16 at 11:10
  • @Destic Its sad to hear that my code isn't easy to understand. I would suggest you to learn about LINQ, it is very useful. I am going to change my question and add another version of first solution, using basic foreach – Alberto Monteiro Jan 06 '16 at 11:18
  • @Destic btw, the other code has many for and if/else, and it is really easy to understand? – Alberto Monteiro Jan 06 '16 at 11:24
  • @AlbertoMonteiro - fwiw, I thought your (linq) solution was far more elegant than mine. The only reason I persisted was because I could see a problem with your solution, namely that a ';' embedded in the data would cause a failure. The `'if..else'`s in my code are all there to support files with no header line as well as those with - the OP did not require that (I guess) so I suppose I got lucky. – Hugh Jones Jan 06 '16 at 12:05
  • @HughJones If there is a `;` inside the row and it is part of data instead the delimiter it will fail, so I suggest the use of CsvReader, that already handle that. But np, I just asked that because I want to people growth up with answers, not just copy the answer and paste and done. – Alberto Monteiro Jan 06 '16 at 12:19
  • @AlbertoMonteiro I have removed the 'if..else' in my code because i dont need it so its very easy to understand. The thing is that not only do I have to understand the code but also external people who have never come into contact with C# but have experience in C programming. Your version with foreach loops is clean and simple and is exactly what i was looking for. Thanks mates for your efforts! – Destic Jan 06 '16 at 12:21
  • @AlbertoMonteiro that's the right attitude!! but in my case its ok because my data must exactly match a scheme ! – Destic Jan 06 '16 at 12:25
  • Have I just been robbed? – Hugh Jones Jan 06 '16 at 12:47
  • @Destic I didn't want that you accept mine question, I just wont to help you with a solution with nice performance and less code, and less cyclomatic complexity. If Hugh's answer is the solution for you, still with him. I just wanted to share some knowledge. You can gave reputation back to Hugh – Alberto Monteiro Jan 06 '16 at 12:52
  • @AlbertoMonteiro - I think he copied and pasted your second version, so fair enough – Hugh Jones Jan 06 '16 at 12:55
  • @HughJones I really dont care about this, finally i used your version but i think for others his detailed response is more useful.. ! – Destic Jan 06 '16 at 13:06
4

using Microsoft.VisualBasic.FileIO;

I would suggest the following. It has the advantage that a ';' inside a quoted field will be handled correctly, and it is not constrained to a particular CSV format.

public class CsvImport
{
    public static DataTable NewDataTable(string fileName, string delimiters, bool firstRowContainsFieldNames = true)
    {
        DataTable result = new DataTable();

        using (TextFieldParser tfp = new TextFieldParser(fileName))
        {
            tfp.SetDelimiters(delimiters); 

            // Get Some Column Names
            if (!tfp.EndOfData)
            {
                string[] fields = tfp.ReadFields();

                for (int i = 0; i < fields.Length; i++)
                {
                    if (firstRowContainsFieldNames)
                        result.Columns.Add(fields[i]);
                    else 
                        result.Columns.Add("Col" + i);
                }

                // If first line is data then add it
                if (!firstRowContainsFieldNames)
                    result.Rows.Add(fields); 
            }

            // Get Remaining Rows
            while (!tfp.EndOfData) 
                result.Rows.Add(tfp.ReadFields());
        }

        return result;
    } 
}
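
A minimal usage sketch (assuming the OP's ';'-delimited file and a WinForms DataGridView named dataGridView):

DataTable table = CsvImport.NewDataTable(filePath, ";");
dataGridView.DataSource = table;

If your data contains quoted fields, TextFieldParser's HasFieldsEnclosedInQuotes property controls how they are parsed, which is what lets an embedded ';' inside quotes survive.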
Hugh Jones
  • thanks, looks like a reliable way to do it but where can i find VisualBasic.FileIO Namespace? – Destic Jan 06 '16 at 08:38
  • Add a reference to `Microsoft.Visualbasic` to your project – Hugh Jones Jan 06 '16 at 08:41
  • @Destic - as for reliable; there is a basic assumption that the csv file being loaded is valid in the first place. – Hugh Jones Jan 06 '16 at 08:52
  • your method works well. i just edited the while at the end because i wanted to haveDataTable including rows as result. `while (!tfp.EndOfData) { string[] fields = tfp.ReadFields(); DataRow dr = result.NewRow(); result.Rows.Add(fields); }` – Destic Jan 06 '16 at 08:59
  • @destic - I have slightly shortened/simplified the code – Hugh Jones Jan 06 '16 at 09:14
  • @Destic If performance is at all of interest, check http://stackoverflow.com/a/20456597 regarding performance of `TextFieldParser` vs `string.Split` – ardila Jan 06 '16 at 13:26
0

CsvHelper's author has built this functionality into the library. The code becomes simply:

using (var reader = new StreamReader("path\\to\\file.csv"))
using (var csv = new CsvReader(reader, CultureInfo.CurrentCulture))
{
    // Do any configuration to `CsvReader` before creating CsvDataReader.
    using (var dr = new CsvDataReader(csv))
    {        
        var dt = new DataTable();
        dt.Load(dr);
    }
}

CultureInfo.CurrentCulture is used to determine the default delimiter, and is what you need if you want to read a CSV saved by Excel.
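
If your file always uses ';' regardless of culture, newer versions of CsvHelper also let you set the delimiter explicitly through a CsvConfiguration instead of relying on CultureInfo.CurrentCulture; a hedged sketch (the exact API depends on your CsvHelper version):

var config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
    Delimiter = ";"  // force ';' instead of the culture's default
};

using (var reader = new StreamReader("path\\to\\file.csv"))
using (var csv = new CsvReader(reader, config))
using (var dr = new CsvDataReader(csv))
{
    var dt = new DataTable();
    dt.Load(dr);
}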

TitanRain
0

I had the same problem, but I found a way to adapt @Alberto Monteiro's answer in my own way...

My CSV file does not have a first-line column header; I personally didn't put one there for various reasons. So this is the sample file:

1,john doe,j.doe,john.doe@company.net
2,jane doe,j.doe,jane.doe@company.net

So you get the idea, right?

Now I am going to add the columns manually to the DataTable, and I am also going to use a Task to do it asynchronously, simply using a foreach loop to add the values to DataTable.Rows with the following function:

public Task<DataTable> ImportFromCSVFileAsync(string filePath)
{
    return Task.Run(() =>
    {
        DataTable dt = new DataTable();
        dt.Columns.Add("Index");
        dt.Columns.Add("Full Name");
        dt.Columns.Add("User Name");
        dt.Columns.Add("Email Address");

        // splitting the values using Split() command 
        foreach(var srLine in File.ReadAllLines(filePath))
        {
            dt.Rows.Add(srLine.Split(','));
        }
        return dt;
    });
}

Now, to call the function, I simply use a button Click event to do the job:


private async void ImportToGrid_STRBTN_Click(object sender, EventArgs e)
{
   // Handling UI objects
   // The best idea for me was to put everything in a Panel and disable it while waiting,
   // then enable it again after the job is done,
   // using a ToolStrip docked to the bottom, outside of the panel, to show progress with a
   // ProgressBar set to the Marquee style

   panel1.Enabled = false;
   progressbar1.Visible = true;
   try
   {
      DataTable dt = await ImportFromCSVFileAsync(@"c:\myfile.txt");
      if (dt.Rows.Count > 0)
      {
         Datagridview1.DataSource = null; // To clear the previous data before adding the new ones
         Datagridview1.DataSource = dt;
      }
   }
   catch (Exception ex)
   {
      MessageBox.Show(ex.Message, "Error");
   }

   progressbar1.Visible = false;
   panel1.Enabled = true;

}
Atrin Noori