static void Main(string[] args) 
{

    string TheDataFile = "";
    string ErrorMsg = "";
    string lngTransDate = "";
    ProcessDataFile  ProcessTheDataFile = new ProcessDataFile();

    string TheFile = "S:\\MIS\\Provider NPI file\\Processed\\npidata_20050523-20161009.csv";
    string[] lines = File.ReadAllLines(TheFile, Encoding.UTF8);//Read all lines to an array 
    Console.WriteLine(lines.Length.ToString());
    Console.ReadLine();
}

This throws an error because the file is very large (it has 6 million lines). Is there a way to handle large files and count the number of lines?

LarsTech
  • Read it line by line. See this [example](https://stackoverflow.com/questions/33515571). – Han Jul 19 '17 at 15:29
  • If you just want to get the line count, stream it and loop through line by line to get the count. That way you aren't holding the whole thing in memory. – LarsTech Jul 19 '17 at 15:29
  • [Maybe this post can help](https://stackoverflow.com/questions/23989677/file-readalllines-or-stream-reader) – blaze_125 Jul 19 '17 at 15:30
  • BTW, you don't need the `.ToString()` on `lines.Length`... `Console.WriteLine` will handle integers just fine. – Matthew Whited Jul 19 '17 at 15:44

3 Answers


Use a StreamReader:

string TheFile = "S:\\MIS\\Provider NPI file\\Processed\\npidata_20050523-20161009.csv";
int count = 0;
using (System.IO.StreamReader sr = new System.IO.StreamReader(TheFile))
{
    while (sr.ReadLine() != null)
        count++;
}
mm8

You need to evaluate the file lazily so that it isn't loaded into memory entirely.

Helper method

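using System.Collections.Generic;
using System.IO;

public static class ToolsEx
{
    // Iterator method: each line is read only when the caller asks for it,
    // so the whole file is never held in memory at once.
    public static IEnumerable<string> ReadAsLines(this string filename)
    {
        using (var streamReader = new StreamReader(filename))
            while (!streamReader.EndOfStream)
                yield return streamReader.ReadLine();
    }
}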

Usage

// Count() is the LINQ extension method, so this needs: using System.Linq;
var lineCount = "yourfile.txt".ReadAsLines().Count();
Matthew Whited
  • Personally I would call the method `EnumerateLines` so it follows the `Directory.GetFiles()`/`Directory.EnumerateFiles()` pattern. I also would not make it an extension method on `string`; `"yourfile.txt".ReadAsLines()` looks weird to me. – Scott Chamberlain Jul 19 '17 at 15:35
  • That's your opinion... and I disagree with it :) The intent is to read the file. It's also easier for non-programmers to understand `Read` than `Enumerate`. (In my opinion) – Matthew Whited Jul 19 '17 at 15:39
  • I would agree with @ScottChamberlain. Not all strings are files. That's the weird part. – LarsTech Jul 19 '17 at 15:45
  • If it makes you feel better then use it like this...`var lineCount = ToolsEx.ReadAsLines(filename:"yourfile.txt").Count()` – Matthew Whited Jul 19 '17 at 15:50
  • I'm with @ScottChamberlain regarding `string` extension method. Also there is no need of such helper method at all since it replicates [`File.ReadLines`](https://msdn.microsoft.com/en-us/library/dd383503(v=vs.110).aspx) – Ivan Stoev Jul 19 '17 at 16:17
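
As the last comment points out, the framework's `File.ReadLines` already enumerates a file lazily, so the helper above can be replaced by a built-in call. A minimal sketch, assuming .NET Framework 4.0 or later; the `LineCounter` class and `CountLines` method are hypothetical names used here only for illustration:

```csharp
using System;
using System.IO;
using System.Linq;

public static class LineCounter
{
    // Hypothetical helper (not from the thread).
    // File.ReadLines returns a lazy IEnumerable<string>, so lines are
    // read one at a time while LongCount() walks the sequence; the file
    // is never loaded into memory as a whole.
    public static long CountLines(string path)
    {
        return File.ReadLines(path).LongCount();
    }
}
```

It could be called as `Console.WriteLine(LineCounter.CountLines(TheFile));`. `LongCount()` is used instead of `Count()` only so the result cannot overflow an `int` on extremely large files; for 6 million lines either would work.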

According to this already accepted answer, this should do it.

using System;
using System.IO;

namespace CountLinesInFiles_45194927
{
    class Program
    {
        static void Main(string[] args)
        {
            int counter = 0;
            foreach (var line in File.ReadLines("c:\\Path\\To\\File.whatever"))
            {
                counter++;
            }
            Console.WriteLine(counter);
            Console.ReadLine();
        }
    }
}
blaze_125
  • @ScottChamberlain, using a 1.6 GB data file, the VSHost process used over 5 GB of memory with OP's approach. My approach did not go over 14 MB. Not to mention that OP's approach also took longer to process. – blaze_125 Jul 19 '17 at 16:01