I am currently using the code below to compare two CSV files with each other. It outputs every row that is not the same in both files. However, when a row is missing from one file, every row after it is reported as different. How can I fix this? Thanks in advance.

List<string> lines = new List<string>();
List<string> lines2 = new List<string>();

try
{
    StreamReader reader = new StreamReader(System.IO.File.OpenRead(file1));
    StreamReader read = new StreamReader(System.IO.File.OpenRead(file2));
    List<string> differences = new List<string>();
    string line;
    string line2;

    int i = 0;
    while ((line = reader.ReadLine()) != null && (line2 = read.ReadLine()) != null)
    {
        string[] split = line.Split(Convert.ToChar("\t"));
        string[] split2 = line2.Split(Convert.ToChar("\t"));

        if (split[i] != split2[i])
        {
            differences.Add("this row is not the same:, " + line);
        }
        else
        {
        }
        i++;
    }
    System.IO.File.WriteAllLines(differencesFile, differences);
    reader.Dispose();
    read.Dispose();
}
catch
{
}
erikvimz
Mylan
  • Have a look at *edit distance*: https://en.wikipedia.org/wiki/Edit_distance, e.g. https://en.wikipedia.org/wiki/Levenshtein_distance – Dmitry Bychenko Dec 19 '17 at 13:11
  • What does "a row is missing" mean? You can skip empty lines – Tim Schmelter Dec 19 '17 at 13:12
  • Also, what is that _row_ variable used to index the arrays? This code seems to be incomplete, not to mention the stray curly brace in the middle of the if statement – Steve Dec 19 '17 at 13:13
  • I am going to use this code to compare two CSV files with articles from a website. When a product is missing from the second file, it means it's not on the website. So a missing row means a certain product is not in that file – Mylan Dec 19 '17 at 13:13
  • @Mylan Can you post those two `.csv` files? – erikvimz Dec 19 '17 at 13:17
  • Productcode, EAN, Product description, Brand, SRP, Dealer price, Stock, Status, Category level 1, Category level 2: those are the headers – Mylan Dec 19 '17 at 13:20
  • Why *split* if you want to compare lines? Repeating tabs means empty fields, so they can't be ignored. – Panagiotis Kanavos Dec 19 '17 at 13:23
  • @Mylan: now you have edited your code but it doesn't make sense. You increase `i` for every line, but `i` is the iterator for the columns (split by tab). You need a for-loop in the while-loop. – Tim Schmelter Dec 19 '17 at 13:29
  • "I am going to use this code to compare two csv files with articles from a website and when a product is missing in the second file it means its not on the website." - Just a thought: comparing the whole line also means price, stock and status must be equal. Wouldn't it make more sense to compare just one column, for example EAN (assuming all products have that column filled)? – Fildor Dec 19 '17 at 13:38
  • Yes, that's what I want eventually – Mylan Dec 19 '17 at 13:44
  • I would try to solve this using OleDB or one of the CSV readers mentioned in the top answer to this post: https://stackoverflow.com/questions/6813607/parsing-csv-using-oledb-using-c-sharp . Is that an option? – LocEngineer Dec 19 '17 at 14:01
  • Is there something wrong with `'\t'`? – NetMage Dec 19 '17 at 19:07
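
The remark about `i` in the comments can be made concrete with a small sketch. Assuming two lines have already been paired up, an inner for-loop over the tab-separated fields reports which columns differ. The names `RowComparer` and `DifferingColumns` are hypothetical and do not come from the original post.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the original post.
public static class RowComparer
{
    // Returns the zero-based indexes of the tab-separated columns that differ
    // between two lines. A column that exists in only one of the lines
    // counts as different.
    public static List<int> DifferingColumns(string line1, string line2)
    {
        string[] fields1 = line1.Split('\t');
        string[] fields2 = line2.Split('\t');
        int columnCount = Math.Max(fields1.Length, fields2.Length);
        var differing = new List<int>();

        // The loop variable indexes columns, not lines.
        for (int col = 0; col < columnCount; col++)
        {
            bool same = col < fields1.Length
                        && col < fields2.Length
                        && fields1[col] == fields2[col];
            if (!same)
            {
                differing.Add(col);
            }
        }
        return differing;
    }
}
```

This only addresses the column indexing. It does not solve the missing-row problem, because the two files are still read in lockstep.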

1 Answer

After help from a friend I made it work with this code:

// Assumes: using System; using System.Collections.Generic; using System.IO;
List<string> output = new List<string>();
string differencesFile = path;
File.WriteAllText(differencesFile, "");
try
{
    // For every row of the first file, look for its key (column 0) in the second file.
    using (StreamReader readFile1 = new StreamReader(File.OpenRead(pathfile1)))
    {
        string lineFile1;
        while ((lineFile1 = readFile1.ReadLine()) != null)
        {
            bool match = false;
            string[] columns = lineFile1.Split('\t');

            // The second file is re-read for each row of the first file.
            using (StreamReader readFile2 = new StreamReader(File.OpenRead(pathfile2)))
            {
                string line2;
                while ((line2 = readFile2.ReadLine()) != null)
                {
                    string[] columnsFile2 = line2.Split('\t');
                    if (columns[0] == columnsFile2[0])
                    {
                        match = true;
                        break; // key found, no need to read further
                    }
                }
            }

            if (!match)
            {
                output.Add(columns[0] + "; doesn't exist in pathfile2");
            }
        }
    }
    File.WriteAllLines(differencesFile, output);
}
catch (Exception ex)
{
    // Report the failure instead of silently swallowing it.
    Console.Error.WriteLine("Comparison failed: " + ex.Message);
}
Mylan
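
The approach above reopens the second file for every row of the first. A possible alternative, sketched here under the assumption that one column uniquely identifies a product, is to collect the keys of the second file into a `HashSet<string>` once and then check each row of the first file against it. The names `CsvKeyDiff` and `FindMissingKeys` and the `keyColumn` parameter are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the original answer.
public static class CsvKeyDiff
{
    // Returns the keys (the value in column keyColumn) that occur in lines1
    // but not in lines2. Lines with too few columns are skipped.
    public static List<string> FindMissingKeys(
        IEnumerable<string> lines1,
        IEnumerable<string> lines2,
        int keyColumn = 0)
    {
        // Collect every key of the second file once.
        var keysInFile2 = new HashSet<string>();
        foreach (string line in lines2)
        {
            string[] columns = line.Split('\t');
            if (keyColumn < columns.Length)
            {
                keysInFile2.Add(columns[keyColumn]);
            }
        }

        // Report each key of the first file that the second file lacks.
        var missing = new List<string>();
        foreach (string line in lines1)
        {
            string[] columns = line.Split('\t');
            if (keyColumn < columns.Length && !keysInFile2.Contains(columns[keyColumn]))
            {
                missing.Add(columns[keyColumn]);
            }
        }
        return missing;
    }
}
```

It could be called as `CsvKeyDiff.FindMissingKeys(File.ReadLines(pathfile1), File.ReadLines(pathfile2))`. If the columns appear in the order listed in the comments, passing `keyColumn: 1` would compare on EAN instead of Productcode, but that ordering is an assumption. A header row that is identical in both files matches itself and is not reported.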