
I need to write a JUnit test that compares two CSV files of the same format and passes only if the absolute difference between corresponding values is less than a threshold. Strings must match exactly; doubles must satisfy the threshold criterion.

CSV FORMAT:

first.csv
 Name    price-1    price-2
 item1    5.12       6.12
 item2    4.23       5.56
 item3    11.2       12.23

second.csv

 Name    price-1     price-2
 item1    5.12       6.10
 item2    4.20       5.50
 item3    11.19      12.19

Now let's say the difference threshold is 0.15. The absolute difference between price-1 of item2 in first.csv and second.csv is 0.03, so the JUnit test should pass. If the threshold is 0.02, it should fail.

What would be a good solution for this?

2 Answers


When you use assertEquals with doubles, you can pass in a threshold. JUnit calls this the delta.

Or you can use

assertTrue (Math.abs(val1 - val2) < threshold);

So, in your example, price-2 for item1 is 6.12 and 6.10.

With the first approach you could use

assertEquals(6.12d, 6.10d, 0.15);

which would pass, or with the second

assertTrue(Math.abs(6.12d - 6.10d) < 0.15);

which would also pass.

I would recommend playing around with assertEquals and plugging in numbers so you understand its overloaded methods.

As you are reading from a file, you are likely to get strings. To convert them to doubles, do:

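try {
    // Use primitives so the (double, double, double) overload is chosen directly
    double d1 = Double.parseDouble(str1);
    double d2 = Double.parseDouble(str2);
    assertEquals(d1, d2, 0.15);
} catch (NumberFormatException e) {
    // Not a number, so it cannot be compared with a delta.
    // Perhaps call fail("fail msg here"), or compare the strings exactly.
}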
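For files that mix string and numeric columns, the parse-then-compare idea can be wrapped in a single helper. This is only a minimal sketch in plain Java, so it runs without JUnit on the classpath. The names `CellCompare` and `cellsMatch` are hypothetical, and treating every unparsable cell as a string that must match exactly is an assumption based on the question.

```java
final class CellCompare {
    /**
     * Returns true if both cells parse as doubles whose absolute difference
     * is below the threshold, or, when parsing fails, if the two strings are
     * exactly equal. (Hypothetical helper, not part of JUnit.)
     */
    static boolean cellsMatch(String a, String b, double threshold) {
        try {
            double d1 = Double.parseDouble(a.trim());
            double d2 = Double.parseDouble(b.trim());
            return Math.abs(d1 - d2) < threshold;
        } catch (NumberFormatException e) {
            // At least one cell is not numeric: require an exact string match.
            return a.equals(b);
        }
    }
}
```

Inside a JUnit test this could be used as assertTrue(CellCompare.cellsMatch(str1, str2, 0.15)).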
RNJ
  • Shouldn't you take the absolute value when subtracting the two? – Danny Aug 29 '12 at 19:06
  • @Danny ha ha :) - you added that comment as I was editing the post. Correct I have updated it to use Math.abs – RNJ Aug 29 '12 at 19:08
  • @RNJ When I read the CSV file it gives a string array. Do I have to explicitly check whether each value is parsable as a double? In my actual files there are 240 columns; some of them are doubles and others are strings. Thank you for the JUnit info. – arpitMandliya Aug 29 '12 at 19:21
  • I would use Double.parseDouble() and catch the NumberFormatException. If an exception is thrown then it is not a double and so you cannot compare it. If no exception then you can use the assertEquals. @arpitMandliya I have just updated the answer to help – RNJ Aug 29 '12 at 19:23

You listed junit in the tag.

JUnit's assertEquals(double expected, double actual, double delta) lets you specify how close the values have to be with the last parameter.

I'd just read in the values and call assertEquals for each pair in a test...

or is there something to the question I'm not getting?

To parse the lines, your examples use spaces but you say "CSV" (Comma Separated). If they actually are CSV you could use something like:

String[] line = currentLine.split(",");

on each line. That would give you line[0]="item1", line[1]="5.12", line[2]="6.12"

After that try parsing line[1] and line[2] with Double.parseDouble()

By the way, use assertEquals rather than assertTrue. The more specific assertEquals displays the value you expected and the value you got as part of the error in the JUnit results.

I also recommend you pass in the optional string. The test line would look like this:

assertEquals("item " + file1.line[0] + " values do not match",
    Double.parseDouble(file1.line[1]),
    Double.parseDouble(file2.line[1]),
    0.001);
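The same per-value check can be extended to every column of a row. The following is a rough sketch, assuming comma-separated rows; `RowCompare` and `diffRow` are hypothetical names, and the numeric-versus-string decision per cell (try to parse, fall back to an exact string match) is an assumption. It collects one message per mismatching column instead of stopping at the first one, and it is written in plain Java so it runs without JUnit.

```java
import java.util.ArrayList;
import java.util.List;

final class RowCompare {
    /** Compares two comma-separated rows column by column; returns one message per mismatch. */
    static List<String> diffRow(String row1, String row2, double delta) {
        String[] a = row1.split(",", -1); // -1 keeps trailing empty columns
        String[] b = row2.split(",", -1);
        List<String> problems = new ArrayList<>();
        if (a.length != b.length) {
            problems.add("column count differs: " + a.length + " vs " + b.length);
            return problems;
        }
        for (int i = 0; i < a.length; i++) {
            String x = a[i].trim();
            String y = b[i].trim();
            boolean ok;
            try {
                // Numeric cells: absolute difference must be below delta.
                ok = Math.abs(Double.parseDouble(x) - Double.parseDouble(y)) < delta;
            } catch (NumberFormatException e) {
                // Non-numeric cells: exact string match.
                ok = x.equals(y);
            }
            if (!ok) {
                problems.add("item " + a[0].trim() + ", column " + i + ": " + x + " vs " + y);
            }
        }
        return problems;
    }
}
```

In a test, the result could be checked with assertTrue(problems.toString(), problems.isEmpty()), so that every mismatch shows up in the failure message.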

There is also the whole problem of making sure you are reading the same line from each file, that is, getting them paired correctly. If they are guaranteed to be in the same order you are fine, but if not you might want to index the first file by the name field in a hash map:

for (String line : file1.readNextLine()) {
    file1hash.put(line.split(",")[0], line);
}

Then as you iterate through the second file you can easily do:

for (String line2 : file2.readNextLine()) {
    String line1 = file1hash.get(line2.split(",")[0]);
    // compare line1 and line2 here
}

to make sure line1 and line2 refer to the same line.
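The pairing step can be sketched as runnable Java. This assumes each file has already been read into a List<String> of comma-separated rows; `RowPairing` and `indexByName` are hypothetical names. Note that if a name occurs twice in the first file, the later row silently replaces the earlier one in this sketch.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RowPairing {
    /** Indexes rows by their first comma-separated field (the name column). */
    static Map<String, String> indexByName(List<String> rows) {
        Map<String, String> byName = new HashMap<>();
        for (String row : rows) {
            String name = row.split(",")[0].trim();
            byName.put(name, row); // a duplicate name overwrites the earlier row
        }
        return byName;
    }
}
```

While iterating over the second file, byName.get(name) then returns the matching row from the first file, or null when the first file has no row with that name, which a test would probably want to report as a failure.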

Bill K
  • When I read the CSV file it gives a string array. I have to check string equality for column 1 (name) and double accuracy for the other two columns, both at the same time. Do I have to explicitly check whether each value is parsable as a double? In my actual files there are 240 columns; some of them are doubles and others are strings. – arpitMandliya Aug 29 '12 at 19:44