I have 10 text files ("C_1.txt" to "C_10.txt") that look like this

 3   22   34   55   65   
 9   19    0   47   62   
10   28   40   54   72   
15   23   31   52   61

and I need to compare them all to another file ("template.txt") that looks like this

 0   0   0   0   0   
 0   0   0   0   0   
 0   0   0   0   0   
 0   0   0   0   0

Once I have found one that looks the same, I need to print its file name as well as the names of the other files that look the most similar, in their respective order.

At first I thought about turning them into arrays and comparing them, but it didn't work well, so I was wondering if there is a better way of comparing them all at the same time.

Andrew Regan
Jorge
  • First get a complete and clear analysis. How do you know one is "the same"? The same number of rows and columns? Every value the same? How do you decide the order in which they are the same? By the most matching elements? Once you have that, implement it. – Stultuske Sep 12 '19 at 13:14
  • You could read the lines of each file and compare them to the ones of the other files. – deHaar Sep 12 '19 at 13:19
  • I'm focusing on matching elements because all the files share the same structure: 5 rows, 5 columns, and a number in each space. – Jorge Sep 12 '19 at 13:21
  • "I thought about turning them into arrays and comparing them but it didn't work well..." That is likely the right approach. Explain what didn't work well and we may be able to help. – swpalmer Sep 12 '19 at 13:34
  • How is "similar" defined in this context? Is there a specific algorithm to compute similarity? For example, are the spaces between the numbers important, or just the values themselves? – swpalmer Sep 12 '19 at 13:35
  • And note: https://meta.stackoverflow.com/questions/284236/why-is-can-someone-help-me-not-an-actual-question – GhostCat Sep 12 '19 at 13:57

1 Answer

I recommend you filter each file and then compare the filtered output to the filtered template. If an input file is to contain only non-negative integers, a simple Pattern object is enough: replace every run of digits with '0' and remove every run of blanks and tabs. After that, a file with the same layout as the template produces exactly the same string as the template, so the two can be compared for equality.

Something like the following:

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Pattern;

public class Change {
    // Matches every run of digits, i.e. every non-negative integer.
    static final Pattern changeNumbers = Pattern.compile("[0-9]+");
    // Matches every run of blanks and tabs; line breaks are left alone.
    static final Pattern changeTabsBlanks = Pattern.compile("[ \\t]+");

    // Replaces each integer with "0" and strips blanks and tabs, so only the
    // layout (integers per line, number of lines) remains.
    public static String normalizeIntegerString(String str) {
        str = changeNumbers.matcher(str).replaceAll("0");
        str = changeTabsBlanks.matcher(str).replaceAll("");
        return str;
    }

    // Prints a labelled string and its length to stderr for debugging.
    public static void emitString(String strname, String str) {
        System.err.printf("%s length is %d%n", strname, str.length());
        System.err.println(str);
        System.err.println();
    }

    public static void main(String[] args) throws Exception {
        // Usage: java Change <template file> <input file>
        if (args.length < 2) {
            System.err.println("usage: java Change <template file> <input file>");
            return;
        }
        String patternFilename = args[0];
        String inputFilename = args[1];
        String patternFileContent = new String(Files.readAllBytes(Paths.get(patternFilename)));
        String inputFileContent = new String(Files.readAllBytes(Paths.get(inputFilename)));

        String filteredInputFileContent = normalizeIntegerString(inputFileContent);
        String filteredPatternFileContent = normalizeIntegerString(patternFileContent);
        emitString("filteredInputFileContent", filteredInputFileContent);
        emitString("filteredPatternFileContent", filteredPatternFileContent);

        // Line breaks are not normalized, so a trailing newline in only one
        // of the two files still counts as a mismatch.
        if (filteredInputFileContent.contentEquals(filteredPatternFileContent))
            System.out.println("match");
        else
            System.out.println("mismatch");
    }
}
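
The normalization above only tells you whether the layout matches. If "similar" is taken to mean the number of cells that hold the same value as the corresponding template cell, which is what the asker's comment about matching elements suggests, the ranking could be sketched roughly as follows. The class GridSimilarity and its methods parseGrid, countMatches and rank are hypothetical names, and the metric itself is an assumption, since the question never defines it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: ranks grid files by how many cells equal the
// corresponding cell of the template (an assumed similarity metric).
class GridSimilarity {

    // Parses whitespace-separated integers into one int[] per non-blank line.
    static int[][] parseGrid(String text) {
        List<int[]> rows = new ArrayList<>();
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) continue;
            String[] tokens = trimmed.split("\\s+");
            int[] row = new int[tokens.length];
            for (int i = 0; i < tokens.length; i++) {
                row[i] = Integer.parseInt(tokens[i]);
            }
            rows.add(row);
        }
        return rows.toArray(new int[0][]);
    }

    // Counts the positions at which both grids hold the same value.
    // Cells that exist in only one of the grids are ignored.
    static int countMatches(int[][] a, int[][] b) {
        int matches = 0;
        for (int r = 0; r < Math.min(a.length, b.length); r++) {
            for (int c = 0; c < Math.min(a[r].length, b[r].length); c++) {
                if (a[r][c] == b[r][c]) matches++;
            }
        }
        return matches;
    }

    // Returns the candidate names sorted by descending match count.
    // List.sort is stable, so ties keep the map's iteration order.
    static List<String> rank(Map<String, int[][]> candidates, int[][] template) {
        List<String> names = new ArrayList<>(candidates.keySet());
        names.sort(Comparator
                .comparingInt((String n) -> countMatches(candidates.get(n), template))
                .reversed());
        return names;
    }

    public static void main(String[] args) throws IOException {
        // File names are taken from the question; they are read from the working directory.
        int[][] template = parseGrid(new String(Files.readAllBytes(Paths.get("template.txt"))));
        Map<String, int[][]> candidates = new LinkedHashMap<>();
        for (int i = 1; i <= 10; i++) {
            String name = "C_" + i + ".txt";
            candidates.put(name, parseGrid(new String(Files.readAllBytes(Paths.get(name)))));
        }
        // Best match first; a file identical to the template has the highest possible count.
        for (String name : rank(candidates, template)) {
            System.out.println(name + " " + countMatches(candidates.get(name), template));
        }
    }
}
```

Turning the files into arrays keeps the comparison cell by cell, so a different metric (for example the sum of absolute differences) would only require changing countMatches.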
Jeff Holt