2

I have two files that should contain the same values in the first 10 characters of each line (substring 0 to 10), though not in the same order. I have managed to print the values from each file, but I need to know how to report when a value is in the first file and not in the second file, and vice versa. The files are in the following formats, starting with the first file:

6436346346....Other details
9348734873....Other details
9349839829....Other details

Second file:

8484545487....Other details
9348734873....Other details
9349839829....Other details

The first record in the first file does not appear in the second file and the first record in the second file does not appear in the first file. I need to be able to report this mismatch in this format:

Record 6436346346 is in the firstfile and not in the secondfile.
Record 8484545487 is in the secondfile and not in the firstfile.

Here is the code I currently have, which prints the values I need to compare from the two files.

package compare.numbers;

import java.io.*;

/**
 *
 * @author implvcb
 */
 public class CompareNumbers {

/**
 * @param args the command line arguments
 */
 public static void main(String[] args) {
    // TODO code application logic here
    File f = new File("C:/Analysis/");
    String line;
    String line1;
    try {
        String firstfile = "C:/Analysis/RL001.TXT";
        FileInputStream fs = new FileInputStream(firstfile);
        BufferedReader br = new BufferedReader(new InputStreamReader(fs));
        while ((line = br.readLine()) != null) {
            String account = line.substring(0, 10);
             System.out.println(account);


        }
        String secondfile = "C:/Analysis/RL003.TXT";
        FileInputStream fs1 = new FileInputStream(secondfile);
        BufferedReader br1 = new BufferedReader(new InputStreamReader(fs1));
        while ((line1 = br1.readLine()) != null) {
            String account1 = line1.substring(0, 10);
            System.out.println(account1);
        }

    } catch (Exception e) {
        // printStackTrace() reports the error; fillInStackTrace() only resets the trace and prints nothing
        e.printStackTrace();
    }



}
}

Please help me with how I can achieve this effectively. I should mention that I am new to Java and may not grasp the ideas that easily, but I am trying.

Stanley Mungai

6 Answers

2

Here is the sample code to do that:

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public static void eliminateCommon(String file1, String file2) throws IOException
{
    List<String> lines1 = readLines(file1);
    List<String> lines2 = readLines(file2);

    Iterator<String> linesItr = lines1.iterator();
    while (linesItr.hasNext()) {
        String checkLine = linesItr.next();
        if (lines2.contains(checkLine)) {
            // Present in both files: drop it from both lists.
            // remove(Object) drops only the first match, so a value repeated
            // within one file can still be left behind.
            linesItr.remove();
            lines2.remove(checkLine);
        }
    }

    // now lines1 contains the strings that are not present in lines2
    // now lines2 contains the strings that are not present in lines1
    System.out.println(lines1);
    System.out.println(lines2);
}

public static List<String> readLines(String fileName) throws IOException
{
    List<String> lines = new ArrayList<String>();
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(fileName)))) {
        String line;
        while ((line = br.readLine()) != null) {
            // the first 10 characters identify the record
            lines.add(line.substring(0, 10));
        }
    }
    return lines;
}
Ramesh PVK
  • Using sets would be more efficient for searches. – assylias Jul 09 '12 at 12:09
  • @Stanley At the end of eliminateCommon() you have comments. Both the lists contain unique id's. You can print in your own fashion. – Ramesh PVK Jul 09 '12 at 12:22
  • `System.out.println(lines1)` and `System.out.println(lines2)` do not print anything either. – Stanley Mungai Jul 09 '12 at 12:39
  • Thank you Ramesh, I think this is what I have been looking for. One last thing: the output is in an array of the form [2632323236, 734343476, 34734343834]. Can I get the numbers each on its own line? – Stanley Mungai Jul 09 '12 at 13:35
  • lines1 does return the records not in lines2, but lines2 returns even the records that are in the first file. – Stanley Mungai Jul 09 '12 at 14:19
  • Could be whitespace issues. Try trimming the values before adding them to the list. I did not understand what you mean by "Can I get the numbers each on its own line?" – Ramesh PVK Jul 10 '12 at 04:36
2

Perhaps you are looking for something like this (FileUtils is from Apache Commons IO):

Set<String> set1 = new HashSet<>(FileUtils.readLines(new File("C:/Analysis/RL001.TXT")));
Set<String> set2 = new HashSet<>(FileUtils.readLines(new File("C:/Analysis/RL003.TXT")));

Set<String> onlyInSet1 = new HashSet<>(set1);
onlyInSet1.removeAll(set2);

Set<String> onlyInSet2 = new HashSet<>(set2);
onlyInSet2.removeAll(set1);
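
A minimal sketch of the same set-difference idea using only the standard library, assuming the record key is the first 10 characters of each line as in the question. The class name AccountDiff and the helpers readAccounts and onlyIn are made up for illustration, and skipping lines shorter than 10 characters is an assumption:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.LinkedHashSet;
import java.util.Set;

class AccountDiff {

    // Reads the first 10 characters of every line as the record key.
    // Assumption: lines shorter than 10 characters are skipped.
    static Set<String> readAccounts(String fileName) throws IOException {
        Set<String> accounts = new LinkedHashSet<>();
        try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.length() >= 10) {
                    accounts.add(line.substring(0, 10));
                }
            }
        }
        return accounts;
    }

    // Returns the elements of 'left' that are absent from 'right'.
    // Works on a copy, so neither input set is modified.
    static Set<String> onlyIn(Set<String> left, Set<String> right) {
        Set<String> result = new LinkedHashSet<>(left);
        result.removeAll(right);
        return result;
    }

    public static void main(String[] args) throws IOException {
        Set<String> first = readAccounts("C:/Analysis/RL001.TXT");
        Set<String> second = readAccounts("C:/Analysis/RL003.TXT");

        for (String account : onlyIn(first, second)) {
            System.out.println("Record " + account + " is in the firstfile and not in the secondfile.");
        }
        for (String account : onlyIn(second, first)) {
            System.out.println("Record " + account + " is in the secondfile and not in the firstfile.");
        }
    }
}
```

Because a set holds each key once, a record repeated within one file is reported at most once.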
Peter Lawrey
1

If you can guarantee that the files will always have the same format, and that each readLine() call returns a different number, why not store the values in an array of strings rather than in a single string? You can then compare the results more easily.

Nathan White
1
  • Put the values from each file into its own HashSet.
  • Iterate over one of the HashSets and check whether each value exists in the other HashSet. Report it if not.
  • Iterate over the other HashSet and do the same.
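
The steps above could be sketched as follows. This is only an illustration: the class name MismatchReport and the method missingFrom are hypothetical, and the keys are hard-coded from the question's sample data rather than read from the files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MismatchReport {

    // Builds one message per value of 'source' that 'other' does not contain.
    static List<String> missingFrom(Set<String> source, Set<String> other,
                                    String sourceName, String otherName) {
        List<String> messages = new ArrayList<>();
        for (String value : source) {
            if (!other.contains(value)) {
                messages.add("Record " + value + " is in the " + sourceName
                        + " and not in the " + otherName + ".");
            }
        }
        return messages;
    }

    public static void main(String[] args) {
        // Sample keys taken from the question; in practice they come from the two files.
        Set<String> first = new HashSet<>(Arrays.asList("6436346346", "9348734873", "9349839829"));
        Set<String> second = new HashSet<>(Arrays.asList("8484545487", "9348734873", "9349839829"));

        for (String message : missingFrom(first, second, "firstfile", "secondfile")) {
            System.out.println(message);
        }
        for (String message : missingFrom(second, first, "secondfile", "firstfile")) {
            System.out.println(message);
        }
    }
}
```

With the sample keys, this prints the two lines requested in the question. HashSet lookups take constant time on average, so the cost grows roughly linearly with the number of records.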
BenMorel
mmdemirbas
  • Even easier: `hashset1.removeAll(hashset2)`. All elements that remain are unique to the first set. Then do the same thing in the other direction (with fresh copies of the sets, of course). – brimborium Jul 09 '12 at 12:04
1

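Open two Scanners, one per file, and read each file into its own TreeSet:

    // scan1 and scan2 are java.util.Scanner instances opened on the two files.
    // Long is used because 10-digit values such as 9348734873 overflow Integer.
    final TreeSet<Long> ts1 = new TreeSet<Long>();
    final TreeSet<Long> ts2 = new TreeSet<Long>();
    while (scan1.hasNextLine()) {
        ts1.add(Long.valueOf(scan1.nextLine().substring(0, 10)));
    }
    while (scan2.hasNextLine()) {
        ts2.add(Long.valueOf(scan2.nextLine().substring(0, 10)));
    }
You can now compare the ordered contents of the two sets.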
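
One way to compare two sorted sets is to walk them in step, as sketched below. The class name SortedCompare and the method differences are hypothetical. The sketch uses Long keys because a 10-digit value such as 9348734873 does not fit in an Integer; note that parsing keys as numbers drops any leading zeros.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

class SortedCompare {

    // Walks both sorted sets in step and collects a message for every
    // value that appears in only one of them.
    static List<String> differences(TreeSet<Long> first, TreeSet<Long> second) {
        List<String> messages = new ArrayList<>();
        Iterator<Long> it1 = first.iterator();
        Iterator<Long> it2 = second.iterator();
        Long a = it1.hasNext() ? it1.next() : null;
        Long b = it2.hasNext() ? it2.next() : null;

        while (a != null || b != null) {
            // An exhausted side counts as "larger" so the other side drains.
            int cmp = (a == null) ? 1 : (b == null) ? -1 : a.compareTo(b);
            if (cmp < 0) {
                messages.add("Record " + a + " is in the firstfile and not in the secondfile.");
                a = it1.hasNext() ? it1.next() : null;
            } else if (cmp > 0) {
                messages.add("Record " + b + " is in the secondfile and not in the firstfile.");
                b = it2.hasNext() ? it2.next() : null;
            } else {
                // Same value on both sides: advance both iterators.
                a = it1.hasNext() ? it1.next() : null;
                b = it2.hasNext() ? it2.next() : null;
            }
        }
        return messages;
    }

    public static void main(String[] args) {
        // Sample keys taken from the question.
        TreeSet<Long> first = new TreeSet<>(Arrays.asList(6436346346L, 9348734873L, 9349839829L));
        TreeSet<Long> second = new TreeSet<>(Arrays.asList(8484545487L, 9348734873L, 9349839829L));
        for (String message : differences(first, second)) {
            System.out.println(message);
        }
    }
}
```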

EDIT Modified with TreeSet

cl-r
1

OK, first I would save the two sets of strings into collections:

Set<String> s1 = new HashSet<String>(), s2 = new HashSet<String>();
//...
while ((line = br.readLine()) != null) {
  //...
  s1.add(line);
}

Then you can compare those sets and find elements that do not appear in both sets. You can find some ideas on how to do that here.

If you need to know the line number as well, you could just create a String wrapper:

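class Element {
  public String str;
  public int lineNr;

  // Equality is based on the string only; the line number is ignored.
  @Override
  public boolean equals(Object other) {
    return other instanceof Element && ((Element) other).str.equals(str);
  }

  // Must be consistent with equals() for HashSet lookups to work.
  @Override
  public int hashCode() {
    return str.hashCode();
  }
}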

Then you can just use Set<Element> instead.
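
As an alternative to a wrapper class, the line number can be kept in a map from key to line number. The sketch below is only an illustration: the class name LineNumberIndex and the method index are hypothetical, and it assumes that only the first occurrence of a key matters and that lines shorter than 10 characters can be skipped.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LineNumberIndex {

    // Maps each 10-character key to the 1-based number of the first line it appears on.
    static Map<String, Integer> index(List<String> lines) {
        Map<String, Integer> firstSeen = new LinkedHashMap<>();
        int lineNr = 0;
        for (String line : lines) {
            lineNr++;
            if (line.length() < 10) {
                continue; // assumption: too short to hold a key, so skip it
            }
            String key = line.substring(0, 10);
            if (!firstSeen.containsKey(key)) {
                firstSeen.put(key, lineNr);
            }
        }
        return firstSeen;
    }

    public static void main(String[] args) {
        // Sample lines modelled on the question's file format.
        List<String> first = Arrays.asList("6436346346....Other details", "9348734873....Other details");
        List<String> second = Arrays.asList("8484545487....Other details", "9348734873....Other details");

        Map<String, Integer> firstIndex = index(first);
        Map<String, Integer> secondIndex = index(second);

        for (Map.Entry<String, Integer> entry : firstIndex.entrySet()) {
            if (!secondIndex.containsKey(entry.getKey())) {
                System.out.println("Record " + entry.getKey() + " (line " + entry.getValue()
                        + ") is in the firstfile and not in the secondfile.");
            }
        }
    }
}
```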

Community
brimborium