
I have a simple JUnit test which checks that two files have the same content. It works perfectly fine on my Unix laptop.

Here is the test:

    boolean response = false;
    try {
      File got = File.createTempFile("got-", ".csv");
      String outputPath = got.getAbsolutePath();
      testedObject.createCsvFile(outputPath);
      got = new File(outputPath);
      String expectedFilePath = getClass().getClassLoader().getResource("expected.csv").getFile();
      File expected = new File(expectedFilePath);
      response = FileUtils.contentEquals(got, expected); // This is the key line
    } catch (IOException e) {
      // Nothing to do Yay!
    }
    Assert.assertTrue(response);

It works because, if I compare both files manually, for example with the diff command, they are exactly the same.
My teammate codes on a Windows laptop. When he ran the test it broke, and we started debugging.

Visually, both files are the same; a human reviewer cannot spot any difference. But when we ran diff expected.csv got.csv in a Cygwin terminal, Windows reported every line as different,
and the test fails.

What is the problem? Is it the operating system? If so, is there any way to compare file contents that does not depend on the operating system?

Manu Artero

3 Answers


My guess is that this is most likely due to the line ending, which is \n on Unix-like systems and \r\n on Windows.

Anyway, the correct way to see if two files have the same content is to hash both of them (e.g. via SHA-1) and check if the hashes match!
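As a minimal sketch of the hashing idea, assuming only the JDK's `MessageDigest`: the class and method names below (`FileHashCompare`, `digestOf`, `sameDigest`) are made up for illustration, and SHA-256 is my choice rather than something from the answer. Note that a hash is computed over the raw bytes, so a file with \r\n endings and one with \n endings will still produce different digests.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of any library.
class FileHashCompare {

    // Streams the file through the digest so large files are not loaded into memory.
    static byte[] digestOf(Path file, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                md.update(buffer, 0, read);
            }
        }
        return md.digest();
    }

    // True when both files have the same SHA-256 digest.
    // Equal digests make equal content overwhelmingly likely, not certain.
    static boolean sameDigest(Path a, Path b)
            throws IOException, NoSuchAlgorithmException {
        return MessageDigest.isEqual(digestOf(a, "SHA-256"), digestOf(b, "SHA-256"));
    }
}
```

In the test from the question this would be called as `FileHashCompare.sameDigest(got.toPath(), expected.toPath())`. It would presumably fail on Windows for the same reason `FileUtils.contentEquals` does, because both compare bytes.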

Andrea
  • Liked your answer very much. Could you explain how to compare both file hashes? – Manu Artero May 04 '15 at 11:57
  • I'm no Java expert =) but apparently this answer is what you want: http://stackoverflow.com/questions/6293713/java-how-to-create-sha-1-for-a-file. Then just compare the two hash values. – Andrea May 04 '15 at 12:07
  • "the correct way [...] check if the hashes match" Nope! The only thing that you can assert with such a check is whether both files _are not equal_. If the hash values are equal, you cannot be sure that the files are equal, too (although they most likely are). This is not the correct way. – Seelenvirtuose May 04 '15 at 12:23
  • Hash collisions are a reality, true, but using a fairly strong hash algorithm is enough. – Andrea May 04 '15 at 12:24

This behaviour can be attributed to the line separator being different on the two operating systems. If you want it to be platform independent, you should pick up the value from the runtime using

System.getProperty("line.separator");

Also, you might want to have a look at the character encoding of both files.
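One way to make the comparison independent of the line separator is to compare the files line by line instead of byte by byte. The sketch below assumes both files are text in a known encoding; `LineCompare` and `sameLines` are hypothetical names. It relies on `Files.readAllLines`, which treats \n, \r and \r\n all as line terminators and decodes with the charset you pass in, so it addresses the encoding point as well.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not part of any library.
class LineCompare {

    // Reads both files as lists of lines (terminators stripped) and compares the lists.
    // A missing newline at the very end of one file is therefore ignored as well.
    static boolean sameLines(Path a, Path b, Charset charset) throws IOException {
        List<String> linesA = Files.readAllLines(a, charset);
        List<String> linesB = Files.readAllLines(b, charset);
        return linesA.equals(linesB);
    }
}
```

In the test from the question this might look like `LineCompare.sameLines(got.toPath(), expected.toPath(), StandardCharsets.UTF_8)`, assuming the CSV is UTF-8. If you prefer to stay with Commons IO, `FileUtils.contentEqualsIgnoreEOL(file1, file2, charsetName)` appears to do the same job in recent versions; check that your version provides it.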

Akash Yadav

This answer can help you: Java Apache FileUtils readFileToString and writeStringToFile problems. The question's author is talking about a PDF file, but the answer makes sense for your question too.

wrenzi