I have a text file that I am reading with the TextFieldParser class in C#. Most lines in this file end with CRLF, which I can see in Notepad++, but a few lines end with LF only.

I need to count how often each newline sequence occurs and then replace the least-used one with blank, so that the whole file uses the same newline sequence.

Here is my code so far:

if (File.Exists(path))
{
    List<string> delimiters = new List<string> { ";", "-", ",", "|" };
    List<string> linebreakchars = new List<string> { "\r", "\r\n", "\n"};
    Dictionary<string, int> counts = delimiters.ToDictionary(key => key, value => 0);
    Dictionary<string, int> countNewLineChars = linebreakchars.ToDictionary(key => key, value => 0);
    int counter = 0;
    int counterLine = 0;
    string line;
    // Read the file and display it line by line.
    System.IO.StreamReader file =
        new System.IO.StreamReader(path);
    while ((line = file.ReadLine()) != null)
    {
        foreach (string c in delimiters)
            counts[c] = line.Count(t => t == Convert.ToChar(c));
        counter++;
        foreach(string ln in linebreakchars)
            countNewLineChars[ln] = line.Count(t => t.ToString() == ln);
        counterLine++;
    }
    var delimiter = counts.Aggregate((l, r) => l.Value > r.Value ? l : r).Key;
    var newLineChar = countNewLineChars.Aggregate((l, r) => l.Value > r.Value ? l : r).Key;
    string text = File.ReadAllText(path);
    file.Close();
    switch (newLineChar)
    {
        case "\r":
            text = Regex.Replace(text, @"(?<!\r)\n+", "");
            break;
        case "\r\n":
            text = Regex.Replace(text, @"(?<!\r)\n+", "");
            break;
        case "\n":
            text = Regex.Replace(text, @"(?<!\n)\r\n+", "");
            break;
    }
    File.WriteAllText(path, text);
}

It doesn't count any occurrences of the line-break characters.

What am I doing wrong, and how do I get the count of each newline sequence in my file?

AMeh
  • `StreamReader.ReadLine()` will strip the line break characters. Try `StreamReader.ReadToEnd()` to get one string returned (though be aware if the file is large you can run out of memory). – Tim Feb 03 '16 at 16:52
  • That's the issue. I do have some very large files. – AMeh Feb 03 '16 at 17:01
  • Does it really matter what the newline character ends up being? Why not write each line you read to a new file with a StreamWriter? That homogenizes the line endings automatically, with one read and one write operation per line. Run a replace on those characters with each read if you are paranoid. – DaFi4 Feb 03 '16 at 17:57
  • File.ReadAllText does almost the same thing as StreamReader.ReadLine() anyway. You could also check its result for your newline characters. If that is slow, see my suggestion above: rewrite your code to just read the lines and write them out with a new StreamWriter. You can set the NewLine property on the StreamWriter to whichever sequence you prefer, and it will perform faster. (Don't forget to delete the old file afterwards.) – DaFi4 Feb 03 '16 at 18:04
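
The comments above point at two ideas: `ReadLine()` discards the line terminators, so they have to be counted from the raw characters, and large files should be streamed rather than loaded whole. A minimal sketch of both, assuming the text can be read through any `TextReader` (for example a `StreamReader` over the file). The names `LineBreakTools`, `CountLineBreaks`, and `Normalize` are hypothetical helpers made up for this illustration, not part of the original code or of .NET.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper class for illustration only.
static class LineBreakTools
{
    // Counts "\r\n", lone "\r", and lone "\n" by reading one character at a
    // time, so the whole file never has to be held in memory.
    public static Dictionary<string, int> CountLineBreaks(TextReader reader)
    {
        var counts = new Dictionary<string, int> { ["\r\n"] = 0, ["\r"] = 0, ["\n"] = 0 };
        bool pendingCR = false;   // true when the previous character was '\r'
        int c;
        while ((c = reader.Read()) != -1)
        {
            if (c == '\n')
            {
                // '\n' right after '\r' completes a CRLF pair; otherwise it is a lone LF.
                if (pendingCR) counts["\r\n"]++; else counts["\n"]++;
                pendingCR = false;
            }
            else
            {
                // The previous '\r' was not followed by '\n', so it was a lone CR.
                if (pendingCR) counts["\r"]++;
                pendingCR = (c == '\r');
            }
        }
        if (pendingCR) counts["\r"]++;   // a '\r' as the very last character
        return counts;
    }

    // Copies the text line by line, ending every line with newLine.
    // ReadLine() treats "\r\n", "\r", and "\n" all as line terminators.
    public static void Normalize(TextReader reader, TextWriter writer, string newLine)
    {
        writer.NewLine = newLine;
        string line;
        while ((line = reader.ReadLine()) != null)
            writer.WriteLine(line);
    }
}
```

One possible way to use it: open the file with a `StreamReader`, call `CountLineBreaks`, pick the key with the highest count, then reopen the file and call `Normalize` with a `StreamWriter` on a temporary file before swapping the files. Note that this sketch normalizes the minority line endings to the chosen sequence rather than replacing them with blank, and it always terminates the last line, even if the original file did not end with a line break.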
