
I have two PDF files. I want to compare them and show the difference in a message box.

This is what I have so far, but it does not work as I expect:

    private void button1_Click(object sender, EventArgs e)
    {
        string str1 = this.textBox1.Text;
        string str2 = this.textBox2.Text;

        string comparison = str1.Replace(str2, "");
        MessageBox.Show(comparison);
    }


    private void ParsePDF(string filePath)
    {
        string text = string.Empty;

        PdfReader reader = new iTextSharp.text.pdf.PdfReader(filePath);
        byte[] streamBytes = reader.GetPageContent(1);
        PRTokeniser tokenizer = new PRTokeniser(streamBytes);

        while (tokenizer.NextToken())
        {
            if (tokenizer.TokenType == PRTokeniser.TokType.STRING)
            {
                text += tokenizer.StringValue;
            }
        }
        this.textBox1.Text = text.ToString();
        this.textBox2.Text = text.ToString();
    }

}

Just below that, I call the method: ParsePDF("C://Users//lf222aw//Desktop//file1.pdf");

My program currently behaves like this: if one text box contains "I love stackoverflow" and the other contains "I stackoverflow", it prints "I love stackoverflow". What I want it to print is "love", the difference between the two files.

Any ideas?

4 Answers


Take a look at the google-diff-match-patch repository on GitHub.

It is an open-source library for comparing strings, available in several languages, including C#. It can compute the differences between two strings or text documents.

dsmyrnaios

If you split your files into words, you may be able to use something like:

    Dim str1 = New String() {"I", "love", "stackoverflow"}
    Dim str2 = New String() {"I", "stackoverflow"}
    Dim Diff = str1.Where(Function(x) Not str2.Contains(x)).ToArray()
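
Since the question is in C#, here is a rough C# equivalent of the snippet above. It is only a sketch: the class and method names (`WordDiff`, `MissingWords`) are made up for illustration, and it assumes that splitting on whitespace is good enough for the extracted PDF text. It ignores word order and repeated words, so it reports only words that never appear in the second text.

```csharp
using System;
using System.Linq;

public static class WordDiff
{
    // Hypothetical helper: returns the words of `first`
    // that do not occur anywhere in `second`.
    public static string[] MissingWords(string first, string second)
    {
        char[] separators = { ' ', '\t', '\r', '\n' };
        string[] firstWords = first.Split(separators, StringSplitOptions.RemoveEmptyEntries);
        string[] secondWords = second.Split(separators, StringSplitOptions.RemoveEmptyEntries);

        // Keep each word of the first text that the second text lacks.
        return firstWords.Where(w => !secondWords.Contains(w)).ToArray();
    }

    public static void Main()
    {
        string[] diff = MissingWords("I love stackoverflow", "I stackoverflow");
        Console.WriteLine(string.Join(" ", diff)); // prints "love"
    }
}
```

In the form from the question, the two arguments would be `this.textBox1.Text` and `this.textBox2.Text`, and the joined result would go to `MessageBox.Show`.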
U1199880

You are using String.Replace incorrectly. It finds every occurrence of the second string inside the first and replaces it with the given text. In your example, you are searching for "I stackoverflow" inside "I love stackoverflow". That exact substring never occurs, so nothing is replaced and your program still prints "I love stackoverflow". See this SO post on string comparison: How to find difference between two strings?
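
A small console sketch makes the behaviour easy to see. The class name `ReplaceDemo` is only illustrative; the strings are the ones from the question.

```csharp
using System;

public static class ReplaceDemo
{
    public static void Main()
    {
        string str1 = "I love stackoverflow";

        // "I stackoverflow" is not a contiguous substring of str1,
        // so Replace finds nothing and returns the text unchanged.
        Console.WriteLine(str1.Replace("I stackoverflow", "")); // prints "I love stackoverflow"

        // Replace only removes text that matches exactly.
        Console.WriteLine(str1.Replace("love ", "")); // prints "I stackoverflow"
    }
}
```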

Dave Zych

If I am reading your code correctly, you are writing the contents of the same single page to both text boxes.

Also, your Replace call can never work here, because "I stackoverflow" is not present in "I love stackoverflow".

For example, take "a b c" and "a c": "a c" is not a substring of "a b c", so there is nothing to replace.

See also: How to find difference between two strings?
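
To address the first point, one option is to have the parsing method return the text instead of writing it to both text boxes, and then call it once per file. The sketch below reuses the iTextSharp calls from the question (`PdfReader`, `GetPageContent`, `PRTokeniser`) and assumes the same library version; newer releases may not accept a `byte[]` in the `PRTokeniser` constructor. The class name `PdfText` and method name `Extract` are made up for illustration.

```csharp
using System.Text;
using iTextSharp.text.pdf;

public static class PdfText
{
    // Hypothetical helper: collects the string tokens from every page
    // of one PDF, using the same tokenizer loop as the question.
    public static string Extract(string filePath)
    {
        var text = new StringBuilder();
        PdfReader reader = new PdfReader(filePath);
        try
        {
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                byte[] streamBytes = reader.GetPageContent(page);
                PRTokeniser tokenizer = new PRTokeniser(streamBytes);
                while (tokenizer.NextToken())
                {
                    if (tokenizer.TokenType == PRTokeniser.TokType.STRING)
                    {
                        text.Append(tokenizer.StringValue);
                    }
                }
            }
        }
        finally
        {
            reader.Close(); // release the file handle
        }
        return text.ToString();
    }
}
```

Each text box would then get its own file, for example `this.textBox1.Text = PdfText.Extract(firstPath);` and `this.textBox2.Text = PdfText.Extract(secondPath);`, where `firstPath` and `secondPath` stand for the two PDF paths.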

user1378730