I have to write an application that compares some very large CSV files, each with 40,000 records. I have written one that works correctly, but the comparison takes a long time: the two files may be in a different order or contain different records, so I have to iterate (40000^2)*2 times.
Here is my code:
    if (nomFich.equals("CAR"))
    {
        // Outer loop: every line of the first file
        while ((linea = br3.readLine()) != null)
        {
            array = linea.split(",");
            // Comparison key: columns 0, 1, 2 and 8
            spliteado = array[0] + array[1] + array[2] + array[8];
            // The second file is reopened and fully re-read for every outer line
            FileReader fh3 = new FileReader(cadena + lista2[0]);
            BufferedReader bh3 = new BufferedReader(fh3);
            find = 0;
            while ((linea2 = bh3.readLine()) != null)
            {
                array2 = linea2.split(",");
                spliteado2 = array2[0] + array2[1] + array2[2] + array2[8];
                if (spliteado.equals(spliteado2))
                {
                    find = 1;
                }
            }
            // No match found: report the line as new
            if (find == 0)
            {
                bw3.write("+++++++++++++++++++++++++++++++++++++++++++");
                bw3.newLine();
                // "The following CGIs have been added to the new list"
                bw3.write("Se han incorporado los siguientes CGI en la nueva lista");
                bw3.newLine();
                bw3.write(linea);
                bw3.newLine();
                aparece = 1;
            }
            bh3.close();
        }
    }
I think that using a Set in Java is a good option, as the following post suggests:
Comparing two csv files in Java
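As far as I understand it, the Set approach would look roughly like the sketch below. It is only a minimal sketch under my own assumptions: the class name CsvSetDiff and the helpers key, loadKeys and addedLines are names I made up for illustration, the sample data in main is invented, and I added a "|" separator between the key fields (my current code concatenates them directly). The idea is to read the old file once into a HashSet of keys, then read the new file once and look each key up in the set.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CsvSetDiff {

    // Builds the comparison key from columns 0, 1, 2 and 8, as in my loop.
    // The "|" separator is an addition so that ("ab","c") and ("a","bc") differ.
    public static String key(String line) {
        String[] f = line.split(",");
        return f[0] + "|" + f[1] + "|" + f[2] + "|" + f[8];
    }

    // Reads every line once and collects its key in a HashSet.
    public static Set<String> loadKeys(BufferedReader reader) throws IOException {
        Set<String> keys = new HashSet<>();
        String line;
        while ((line = reader.readLine()) != null) {
            keys.add(key(line));
        }
        return keys;
    }

    // Returns the lines of the new file whose key is not present in oldKeys.
    public static List<String> addedLines(BufferedReader newFile, Set<String> oldKeys)
            throws IOException {
        List<String> added = new ArrayList<>();
        String line;
        while ((line = newFile.readLine()) != null) {
            if (!oldKeys.contains(key(line))) {
                added.add(line);
            }
        }
        return added;
    }

    public static void main(String[] args) throws IOException {
        // Invented sample data with 9 columns; real code would use FileReader instead.
        String oldCsv = "1,2,3,x,x,x,x,x,9\n4,5,6,x,x,x,x,x,7\n";
        String newCsv = "4,5,6,y,y,y,y,y,7\n8,8,8,y,y,y,y,y,8\n1,2,3,y,y,y,y,y,9\n";

        Set<String> oldKeys = loadKeys(new BufferedReader(new StringReader(oldCsv)));
        List<String> added =
                addedLines(new BufferedReader(new StringReader(newCsv)), oldKeys);

        System.out.println(added);
    }
}
```

Each file would be read only once per direction, and the order of the records would not matter. To find records that were removed, I suppose the same two steps could be run with the files swapped.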
But before I try it this way, I would like to know if there are any better options.
Thanks in advance.