I have the code below, which reads a file from a particular directory, processes it, and then moves it to an archive directory. This works fine. I receive a new file every day, and a Control-M scheduler job runs the process.

On the next run, the new file from that directory is compared with the file in the archive directory, and it is processed only if the content is different; otherwise nothing happens. A shell script currently does this comparison, and it produces no log output.

Now I want my Java code to log a message such as 'files are identical' when the file in the input directory and the file in the archive directory have the same content, but I don't know exactly how to do this. I don't want to write any logic that processes or moves the file in that case; I only need to check whether the files are equal and, if they are, log a message. The files I receive are not very big; the maximum size is about 10 MB.

Below is my code:

        for(Path inputFile : pathsToProcess) {
            // read in the file:
            readFile(inputFile.toAbsolutePath().toString());
            // move the file away into the archive:
            Path archiveDir = Paths.get(applicationContext.getEnvironment().getProperty(".archive.dir"));
            Files.move(inputFile, archiveDir.resolve(inputFile.getFileName()),StandardCopyOption.REPLACE_EXISTING);
        }
        return true;
    }

    private void readFile(String inputFile) throws IOException, FileNotFoundException {
        log.info("Import " + inputFile);

        try (InputStream is = new FileInputStream(inputFile);
                Reader underlyingReader = inputFile.endsWith("gz")
                        ? new InputStreamReader(new GZIPInputStream(is), DEFAULT_CHARSET)
                        : new InputStreamReader(is, DEFAULT_CHARSET);
                BufferedReader reader = new BufferedReader(underlyingReader)) {

            if (isPxFile(inputFile)) {
                Importer.processField(reader, tablenameFromFilename(inputFile));
            } else {
                Importer.processFile(reader, tablenameFromFilename(inputFile)); 
            }

        }
        log.info("Import Complete");
    }       

}
Andrew
  • This should help: https://stackoverflow.com/questions/27379059/determine-if-two-files-store-the-same-content – A_C Jan 24 '20 at 09:15
  • Actually, I checked that question before posting, but I don't understand where I should make the changes in my code, as I am not very familiar with file handling. – Andrew Jan 24 '20 at 09:20
  • It seems to me that you are reading multiple files from the input directory. Is that what you want? ---> for(Path inputFile : pathsToProcess) { – A_C Jan 24 '20 at 09:27
  • Also, what is the Importer class doing? Code? – A_C Jan 24 '20 at 09:33
  • Yes, there are multiple files, but I process them one by one. The Importer class inserts the data from the file into an Oracle database table. – Andrew Jan 24 '20 at 10:44
  • All you need to do is in the 'for loop', you need to read the same file from both directories, compare them if content is same, and if not, then move the new file in the archive after processing it. I would say use divide-and-conquer, make some methods which each do tiny bit of the whole job, i.e. the steps above. – A_C Jan 24 '20 at 10:53
  • But I already have for loops where I read the directory: for (Path fileToWork : directoryStream) and for(Path inputFile : pathsToProcess), and they already read multiple files one by one. I think I just need an if condition that checks whether the file from that directory has the same content as the file in the archive directory? During this comparison I don't want to process the file or move anything; I just want to log a message that the file has the same content. I will edit the question accordingly now. – Andrew Jan 24 '20 at 11:05
  • yes I am talking about the same ‘for’ loop. For comparing files the link that I shared earlier will help. – A_C Jan 24 '20 at 11:20
  • OK, can you please show, as an answer to this question, what my if condition should look like and where I need to place it in the code? I am not sure, and if I mess it up I will not be able to test it correctly. – Andrew Jan 24 '20 at 11:22

1 Answer

Based on the limited information about the file size and performance needs, something like this could work. It may not be fully optimized; it is just an example. You may also need to add exception handling in the calling method, since the new method can throw an IOException:

import org.apache.commons.io.FileUtils;  // Add this import statement at the top


// Moved this statement outside the for loop, as it seems there is no need to fetch the archive directory path multiple times.
Path archiveDir = Paths.get(applicationContext.getEnvironment().getProperty("betl..archive.dir"));  

for(Path inputFile : pathsToProcess) {

    // Added this code
    if (checkIfFileMatches(inputFile, archiveDir)) {
        // Add the logger here, for example:
        log.info("Files are identical: " + inputFile.getFileName());
    }
    //Added the else condition, so that if the files do not match, only then you read, process in DB and move the file over to the archive. 
    else {
        // read in the file:
        readFile(inputFile.toAbsolutePath().toString());
        Files.move(inputFile, archiveDir.resolve(inputFile.getFileName()),StandardCopyOption.REPLACE_EXISTING);
    }       
}


//Added this method to check if the source file and the target file contents are same.
// This needs the FileUtils import (Apache Commons IO). You may switch to any other utility class, or read the data byte by byte and compare. If the files are very large, a buffered stream is probably the better choice.
    private boolean checkIfFileMatches(Path sourceFilePath, Path targetDirectoryPath) throws IOException {
        if (sourceFilePath != null) {  // may not need this check
            File sourceFile = sourceFilePath.toFile();
            String fileName = sourceFile.getName();

            File targetFile = targetDirectoryPath.resolve(fileName).toFile();

            if (targetFile.exists()) {
                return FileUtils.contentEquals(sourceFile, targetFile);
            }
        }
        return false;
    }
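
If adding Commons IO is not an option, the same check can be sketched with the JDK alone. This is only a minimal sketch, not a drop-in replacement: the class name `ArchiveComparison` and the method name `matchesArchivedCopy` are hypothetical, and it assumes the files are small enough (the question says at most about 10 MB) to load fully into memory. It compares raw bytes, so for `.gz` inputs it compares the compressed bytes, which can differ even when the uncompressed content is the same (the gzip header can carry a timestamp).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper class; the names are illustrative and not from the original code.
final class ArchiveComparison {

    private ArchiveComparison() { }

    /**
     * Returns true only if a file with the same name exists in archiveDir
     * and has exactly the same bytes as inputFile.
     */
    static boolean matchesArchivedCopy(Path inputFile, Path archiveDir) throws IOException {
        Path archived = archiveDir.resolve(inputFile.getFileName());
        if (!Files.isRegularFile(archived)) {
            return false; // nothing archived under this name yet
        }
        if (Files.size(inputFile) != Files.size(archived)) {
            return false; // files of different length cannot be identical
        }
        // Assumption: files are small, so reading both fully is acceptable.
        return Arrays.equals(Files.readAllBytes(inputFile), Files.readAllBytes(archived));
    }
}
```

In the loop, `ArchiveComparison.matchesArchivedCopy(inputFile, archiveDir)` would take the place of `checkIfFileMatches(inputFile, archiveDir)`. On Java 12 or later, `Files.mismatch(inputFile, archived) == -1` is another way to express the byte comparison without loading both files.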
A_C
  • Thanks Ankur, I will check. Just to let you know, my files are at most 10 MB, which is not big, so is this approach good? – Andrew Jan 24 '20 at 12:10
  • I will check the performance then. One more thing: what change do I need in the code at the line you mentioned, for(Path inputFile : pathsToProcess)? – Andrew Jan 24 '20 at 12:11
  • Check the comments alongside the code, I have tried to make it clear what changes I made – A_C Jan 24 '20 at 12:15
  • @Andrew , I have modified the code to add the else condition. Cross-check it now. – A_C Jan 24 '20 at 12:19
  • I have updated the code, but somehow the files are not detected as identical. I placed the same file in both the archive and the processing directory and ran the code, but it did not log the 'identical' message I added, and it started importing the data. – Andrew Jan 24 '20 at 12:31
  • Add loggers in the new method and check the values of sourceFilePath, fileName and targetFile. Are they being set correctly? – A_C Jan 24 '20 at 12:33
  • Yes, I just did. Actually, after the DB import finished, it logged the message 'Both File contents are same'. I think this should normally appear at the beginning and prevent the import from starting? – Andrew Jan 24 '20 at 12:35