Suppose a folder contains 1000 files. I want to find the files in that folder that are identical to each other.

I tried a byte-by-byte comparison, but it took a long time to finish. This is the code:

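// Needs java.io.BufferedInputStream and java.io.FileInputStream.
// file1 and file2 are the two files being compared.
BufferedInputStream fs1 = new BufferedInputStream(new FileInputStream(file1));
BufferedInputStream fs2 = new BufferedInputStream(new FileInputStream(file2));

boolean match = true;
int b1, b2;
do
{
    // read() returns -1 at end of stream, so files of different length also mismatch
    b1 = fs1.read();
    b2 = fs2.read();
    if (b1 != b2)
    {
        match = false;
        break;
    }
} while (match && b1 != -1);

fs1.close();
fs2.close();

if (match)
{
    Log.e("cyb", "Matched");
}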

Is there a faster way to find identical files?

Satheesh Kumar

1 Answer


The first thing you should do to optimize your code is to compare the sizes of the files. If the sizes differ, the files cannot be identical, so there is no point in reading them into memory and comparing them byte by byte.
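As a rough sketch of the size check, the files can be grouped by length up front, so that each length is read only once and only files sharing a length are ever compared. The class and method names here (`SizeGrouping`, `groupBySize`) are hypothetical, chosen only for illustration:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any library.
class SizeGrouping {
    /** Groups regular files by length. Only groups with two or more members can contain duplicates. */
    static Map<Long, List<File>> groupBySize(File[] files) {
        Map<Long, List<File>> bySize = new HashMap<Long, List<File>>();
        for (File f : files) {
            if (!f.isFile()) {
                continue; // skip directories and anything that is not a regular file
            }
            long size = f.length(); // queried once per file, then kept in the map key
            List<File> group = bySize.get(size);
            if (group == null) {
                group = new ArrayList<File>();
                bySize.put(size, group);
            }
            group.add(f);
        }
        return bySize;
    }
}
```

Calling `groupBySize(new File(dir).listFiles())` and skipping every group of size one leaves only the candidates that still need a content comparison.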

Another thing you can do is to compute a CRC for each file first, and then do the actual byte-by-byte comparison only for files that have the same CRC (and the same length). This should greatly reduce the number of expensive byte-by-byte comparisons if you are dealing with many different files of the same length.
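One possible way to compute such a checksum is with `java.util.zip.CRC32` from the standard library, reading the file in blocks. This is a minimal sketch; the class and method names (`FileChecksum`, `crc32Of`) and the 8 KB buffer size are assumptions made for the example:

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;

// Hypothetical helper, not part of any library.
class FileChecksum {
    /** Reads the whole file once and returns its CRC-32 value. */
    static long crc32Of(File file) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[8192]; // assumed block size; tune as needed
        InputStream in = new BufferedInputStream(new FileInputStream(file));
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                crc.update(buffer, 0, n); // feed only the bytes actually read
            }
        } finally {
            in.close();
        }
        return crc.getValue();
    }
}
```

Two files with different CRC values are certainly different. Equal values only make them candidates, so a final byte-by-byte check is still needed. Computing the checksum reads each file once in full, so it pays off mainly when several files share the same length.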

piokuc
  • What method did you use? Have a look at answers to this question: http://stackoverflow.com/questions/116574/java-get-file-size-efficiently – piokuc Dec 19 '12 at 15:17
  • Also, you should make sure you get the size of each file once only (cache it in a map, for example) – piokuc Dec 19 '12 at 15:18
  • Ok bro. I will try it and tell you. – Satheesh Kumar Dec 19 '12 at 15:23
  • Curious about your CRC idea - is there a precomputed, stored CRC somehow available that saves having to read the file to create one? Unless there is, that seems inefficient, though the file size idea is of course important. Note that you do not have to compare byte-by-byte, you can cast pointers to word size units and compare those, until you get to a possible remainder at the end. – Chris Stratton Dec 19 '12 at 15:27
  • I don't think there is a precomputed CRC stored anywhere (in the file system?) which you can access from Java. Anyway, the method can be good if you happen to deal with many different files of the same length - so it depends on the data. – piokuc Dec 19 '12 at 15:32