I have two large data files (10M lines each). Each line contains a number of fields; the last 3 fields give the x, y, z position. To check my random generator, I want to be sure that there is not a single line in one file with a position identical to any line in the second file. The only thing that occurred to me is something like:
loop over file1
    read file1: eventnr1 energy1 posX1 posY1 posZ1
    loop over file2
        read file2: eventnr2 energy2 posX2 posY2 posZ2
        if ( fabs(posX1 - posX2) < 0.00001 && fabs(posY1 - posY2) < 0.00001 && ... )
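
For reference, this is roughly what my C++ attempt looks like (a minimal sketch, assuming each line has exactly the five fields shown above: event number, energy, x, y, z, and loading both files into memory):

#include <cmath>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

struct Event {
    long   eventnr;
    double energy, x, y, z;
};

// Read all events from a whitespace-separated data file.
// Assumed line layout: eventnr energy posX posY posZ (last 3 fields = position).
std::vector<Event> readEvents(const std::string& filename) {
    std::vector<Event> events;
    std::ifstream in(filename);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream iss(line);
        Event e;
        if (iss >> e.eventnr >> e.energy >> e.x >> e.y >> e.z)
            events.push_back(e);
    }
    return events;
}

int main(int argc, char* argv[]) {
    if (argc != 3) {
        std::cerr << "usage: " << argv[0] << " file1 file2\n";
        return 1;
    }
    const double tol = 0.00001;   // position tolerance

    std::vector<Event> f1 = readEvents(argv[1]);
    std::vector<Event> f2 = readEvents(argv[2]);

    // Brute force: compare every position in file1 against every position in file2.
    for (const Event& a : f1)
        for (const Event& b : f2)
            if (std::fabs(a.x - b.x) < tol &&
                std::fabs(a.y - b.y) < tol &&
                std::fabs(a.z - b.z) < tol)
                std::cout << "match: event " << a.eventnr
                          << " and event " << b.eventnr << '\n';
    return 0;
}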
Of course, this is very time-consuming: with 10M lines in each file that is on the order of 10^14 comparisons (I tried both a bash script and a C++ program; I am not sure which will be faster). Does anyone know of a smarter (faster) way?
To be clear, the files might be completely different except for one or two lines. Using UNIX "diff" would not work (the files are too large).
Best regards,
Machiel