
I want to create a Quartz job which reads .csv files and moves each file into a subdirectory once it has been processed. I tried this:

@Override
public void execute(JobExecutionContext context) {

    File directoryPath = new File("C:\\csv\\nov");
    // Create a new subfolder called "processed" into source directory
    try {
        Files.createDirectory(Path.of(directoryPath.getAbsolutePath() + "/processed"));
    } catch (IOException e) {
        throw new RuntimeException(e);
    }

    FilenameFilter textFileFilter = (dir, name) -> {
        String lowercaseName = name.toLowerCase();
        if (lowercaseName.endsWith(".csv")) {
            return true;
        } else {
            return false;
        }
    };
    // List of all the csv files
    File filesList[] = directoryPath.listFiles(textFileFilter);
    System.out.println("List of the text files in the specified directory:");

    for(File file : filesList) {

        try {
            List<CsvLine> beans = new CsvToBeanBuilder(new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16))
                    .....
                    .build()
                    .parse();

            for(CsvLine item: beans){

                    ....... sql queries

                    Optional<ProcessedWords> isFound = processedWordsService.findByKeyword(item.getKeyword());

                    ......................................
            }

        } catch (Exception e){
            e.printStackTrace();
        }

        // Move the file into the new subdirectory when processing is finished
        Path copied = Paths.get(file.getAbsolutePath() + "/processed");
        Path originalPath = file.toPath();
        try {
            Files.move(originalPath, copied, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}

The processed folder is created when the job starts, but I get this exception:

        2022-11-17 23:12:51.470 ERROR 16512 --- [cessor_Worker-4] org.quartz.core.JobRunShell              : Job DEFAULT.keywordPostJobDetail threw an unhandled Exception: 

java.lang.RuntimeException: java.nio.file.FileSystemException: C:\csv\nov\11_42_33.csv -> C:\csv\nov\processed\11_42_33.csv: The process cannot access the file because it is being used by another process
    at com.wordscore.engine.processor.ImportCsvFilePostJob.execute(ImportCsvFilePostJob.java:127) ~[main/:na]
    at org.quartz.core.JobRunShell.run(JobRunShell.java:202) ~[quartz-2.3.2.jar:na]
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573) ~[quartz-2.3.2.jar:na]
Caused by: java.nio.file.FileSystemException: C:\csv\nov\11_42_33.csv -> C:\csv\nov\processed\11_42_33.csv: The process cannot access the file because it is being used by another process
    at java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92) ~[na:na]
    at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) ~[na:na]
    at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:403) ~[na:na]
    at java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:293) ~[na:na]
    at java.base/java.nio.file.Files.move(Files.java:1432) ~[na:na]
    at com.wordscore.engine.processor.ImportCsvFilePostJob.execute(ImportCsvFilePostJob.java:125) ~[main/:na]
    ... 2 common frames omitted

Do you know how I can release the file and move it into a subdirectory?

EDIT: Updated code with try-catch

@Override
public void execute(JobExecutionContext context) {

    File directoryPath = new File("C:\\csv\\nov");
    // Create a new subfolder called "processed" into source directory
    try {
        Path path = Path.of(directoryPath.getAbsolutePath() + "/processed");
        if (!Files.exists(path) || !Files.isDirectory(path)) {
            Files.createDirectory(path);
        }
    } catch (IOException e) {
        throw new RuntimeException(e);
    }

    FilenameFilter textFileFilter = (dir, name) -> {
        String lowercaseName = name.toLowerCase();
        if (lowercaseName.endsWith(".csv")) {
            return true;
        } else {
            return false;
        }
    };
    // List of all the csv files
    File filesList[] = directoryPath.listFiles(textFileFilter);
    System.out.println("List of the text files in the specified directory:");
    
    for (File file : filesList) {

        try {
            try (var br = new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16)) {
                List<CsvLine> beans = new CsvToBeanBuilder(br)
                        ......
                        .build()
                        .parse();

                for (CsvLine item : beans) {

                    .....
                    if (isFound.isPresent()) {
                        .........
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }

        // Move the file into the new subdirectory when processing is finished
        Path copied = Paths.get(file.getAbsolutePath() + "/processed");
        Path originalPath = file.toPath();
        try {
            Files.move(originalPath, copied, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}

Quartz config:

@Configuration
public class SchedulerConfig {

    private static final Logger LOG = LoggerFactory.getLogger(SchedulerConfig.class);

    private ApplicationContext applicationContext;

    @Autowired
    public SchedulerConfig(ApplicationContext applicationContext) {
        this.applicationContext = applicationContext;
    }

    @Bean
    public JobFactory jobFactory() {
        AutowiringSpringBeanJobFactory jobFactory = new AutowiringSpringBeanJobFactory();
        jobFactory.setApplicationContext(applicationContext);
        return jobFactory;
    }

    @Bean
    public SchedulerFactoryBean schedulerFactoryBean(Trigger simpleJobTrigger) throws IOException {

        SchedulerFactoryBean schedulerFactory = new SchedulerFactoryBean();
        schedulerFactory.setQuartzProperties(quartzProperties());
        schedulerFactory.setWaitForJobsToCompleteOnShutdown(true);
        schedulerFactory.setAutoStartup(true);
        schedulerFactory.setTriggers(simpleJobTrigger);
        schedulerFactory.setJobFactory(jobFactory());
        return schedulerFactory;
    }

    @Bean
    public SimpleTriggerFactoryBean simpleJobTrigger(@Qualifier("keywordPostJobDetail") JobDetail jobDetail,
                                                     @Value("${simplejob.frequency}") long frequency) {
        LOG.info("simpleJobTrigger");

        SimpleTriggerFactoryBean factoryBean = new SimpleTriggerFactoryBean();
        factoryBean.setJobDetail(jobDetail);
        factoryBean.setStartDelay(1000);
        factoryBean.setRepeatInterval(frequency);
        factoryBean.setRepeatCount(4); //         factoryBean.setRepeatCount(SimpleTrigger.REPEAT_INDEFINITELY);
        return factoryBean;
    }

    @Bean
    public JobDetailFactoryBean keywordPostJobDetail() {
        JobDetailFactoryBean factoryBean = new JobDetailFactoryBean();
        factoryBean.setJobClass(ImportCsvFilePostJob.class);
        factoryBean.setDurability(true);
        return factoryBean;
    }

    public Properties quartzProperties() throws IOException {
        PropertiesFactoryBean propertiesFactoryBean = new PropertiesFactoryBean();
        propertiesFactoryBean.setLocation(new ClassPathResource("/quartz.properties"));
        propertiesFactoryBean.afterPropertiesSet();
        return propertiesFactoryBean.getObject();
    }
}

quartz.properties:

org.quartz.scheduler.instanceName=wordscore-processor
org.quartz.scheduler.instanceId=AUTO
org.quartz.threadPool.threadCount=5
org.quartz.jobStore.class=org.quartz.simpl.RAMJobStore

As you can see, I want to have 5 threads in order to execute 5 parallel jobs. Do you know how I can process the files without this exception?

Peter Penzov
  • Does https://stackoverflow.com/questions/4645242/how-do-i-move-a-file-from-one-location-to-another-in-java and/or https://stackoverflow.com/questions/300559/move-copy-file-operations-in-java answers your question? – Reporter Nov 17 '22 at 12:37
  • I saw the code examples. Your idea is to copy the code and then delete the source file? – Peter Penzov Nov 17 '22 at 12:40
  • Over both questions there are three methods to achieve your goal. Moving a file can be an ordinary copy-and-delete. The second offered way is using the `renameTo` method. The third offered way is using the move method. But there are also two tiny hints in the comment sections: you should ensure all objects are closed before you move files. – Reporter Nov 17 '22 at 13:07
  • I think the source of your trouble is the following line: "C:\csv\nov\07_06_26.csv -> C:\csv\nov\07_06_26.csv\processed". The concatenation in `Paths.get(file.getAbsolutePath() + "/processed");` is wrong. – Reporter Nov 17 '22 at 13:09

5 Answers

new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16)

This part creates a resource. A resource is an object that represents an underlying heavy thing - a thing that you can have very few of. In this case, it represents an underlying OS file handle.

You must always safely close these. There are really only 2 ways to do it correctly:

  • Use try-with-resources
  • Save it to a field, and make your class implement AutoCloseable so that code using instances of it can use try-with-resources
try (var br = new FileReader(file, StandardCharsets.UTF_16)) {
  List<CsvLine> beans = new CsvToBeanBuilder(br)
                    .....
                    .build()
                    .parse();
}

Is the answer.
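For the second option, here is a minimal sketch of a processor that owns the reader as a field and implements AutoCloseable itself (the class and method names are made up for illustration):

```java
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical wrapper: owns the reader as a field and is itself AutoCloseable,
// so callers can put the whole processor in a try-with-resources statement.
class CsvFileProcessor implements AutoCloseable {
    private final Reader reader;

    CsvFileProcessor(Path csvFile) throws IOException {
        this.reader = new FileReader(csvFile.toFile(), StandardCharsets.UTF_16);
    }

    Reader reader() {
        return reader;
    }

    @Override
    public void close() throws IOException {
        reader.close(); // releases the OS file handle so the file can be moved
    }
}
```

Used as `try (var p = new CsvFileProcessor(path)) { ... }`, the handle is released when the block exits, no matter how it exits.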

rzwitserloot
  • I updated my code to use `try-catch` but still I get the error. Looks like I need to close the file manually at the end when it's processed. Can you show me how this can be implemented, please? – Peter Penzov Nov 16 '22 at 13:13
  • 1
    You need to update your code to use try-with, as I showed in the snippet; `catch` has nothing to do with it. try-with __will__ close the resource once codeflow exits the braces that go with it - no matter how it exits (run to the end, `return;`, `break;`, throw an exception - no matter). This, is it. If you still get this error, either [A] another process on your machine has it open or [B] you didn't do what I said you should. – rzwitserloot Nov 16 '22 at 14:33
  • Sounds like another process (or other code where you didn't apply try-with or otherwise haven't closed the resource) also has that file open then. You can't force-close the other app from within yours. – rzwitserloot Nov 16 '22 at 15:33
  • I tested this basic JUnit test just to be sure that other process is not locking the file: https://pastebin.com/KsWkysyF I got the same error. Any other ideas? – Peter Penzov Nov 16 '22 at 22:10

Although I agree completely with the answer and comments of @rzwitserloot, note the following in your error stack trace:

java.nio.file.FileSystemException: C:\csv\nov\07_06_26.csv -> C:\csv\nov\07_06_26.csv\processed: The process cannot access the file because it is being used by another process

You are trying to move your file to the backup directory, but note that you are doing it with a wrong path, C:\csv\nov\07_06_26.csv\processed, in the example.

Please, try the following:

@Override
public void execute(JobExecutionContext context) {

    File directoryPath = new File("C:\\csv\\nov");
    // Create a new subfolder called "processed" into source directory
    // Hold a reference to the processed files directory path, we will
    // use it later
    Path processedDirectoryPath;
    try {
        processedDirectoryPath = Path.of(directoryPath.getAbsolutePath() + "/processed");
        if (!Files.exists(processedDirectoryPath) || !Files.isDirectory(processedDirectoryPath)) {
            Files.createDirectory(processedDirectoryPath);
        }
    } catch (IOException e) {
        throw new RuntimeException(e);
    }

    FilenameFilter textFileFilter = (dir, name) -> {
        String lowercaseName = name.toLowerCase();
        if (lowercaseName.endsWith(".csv")) {
            return true;
        } else {
            return false;
        }
    };
    // List of all the csv files
    File filesList[] = directoryPath.listFiles(textFileFilter);
    System.out.println("List of the text files in the specified directory:");
    for(File file : filesList) {

        try {
            try (var br = new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16)){
                List<CsvLine> beans = new CsvToBeanBuilder(br)
                        ......
                        .build()
                        .parse();

            for (CsvLine item : beans) {

                .....
                if (isFound.isPresent()) {
                    .........
        }}

        } catch (Exception e){
            e.printStackTrace();
        }

        // Move here file into new subdirectory when file processing is finished
        // In my opinion, here is the error:
        // Path copied = Paths.get(file.getAbsolutePath() + "/processed");
        Path originalPath = file.toPath();
        try {
            // Note the use of the path we defined before
            Files.move(originalPath, processedDirectoryPath.resolve(originalPath.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}

If you need to increase the throughput of files processed, you could try splitting them into batches, say by some pattern in their names, like a month name or a job number. A simple solution could be to use the provided JobExecutionContext of every job to pass in some split criterion. That criterion is then used in your FilenameFilter, so that every job processes only a portion of the whole set of files. I think this solution is preferable to any kind of locking or similar mechanism.

For example, consider the following:

@Override
public void execute(JobExecutionContext context) {

    File directoryPath = new File("C:\\csv\\nov");
    // Create a new subfolder called "processed" into source directory
    // Hold a reference to the processed files directory path, we will
    // use it later
    Path processedDirectoryPath;
    try {
        processedDirectoryPath = Path.of(directoryPath.getAbsolutePath() + "/processed");
        if (!Files.exists(processedDirectoryPath) || !Files.isDirectory(processedDirectoryPath)) {
            Files.createDirectory(processedDirectoryPath);
        }
    } catch (IOException e) {
        throw new RuntimeException(e);
    }

    // We obtain the file processing criteria using a job parameter
    JobDataMap data = context.getJobDetail().getJobDataMap();
    String filenameProcessingCriteria = data.getString("FILENAME_PROCESSING_CRITERIA");
    // Use the provided criteria to restrict the files that this job
    // will process 
    FilenameFilter textFileFilter = (dir, name) -> {
        String lowercaseName = name.toLowerCase();
        if (lowercaseName.endsWith(".csv") && lowercaseName.indexOf(filenameProcessingCriteria) > 0) {
            return true;
        } else {
            return false;
        }
    };
    // List of all the csv files
    File filesList[] = directoryPath.listFiles(textFileFilter);
    System.out.println("List of the text files in the specified directory:");
    for(File file : filesList) {

        try {
            try (var br = new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16)){
                List<CsvLine> beans = new CsvToBeanBuilder(br)
                        ......
                        .build()
                        .parse();

            for (CsvLine item : beans) {

                .....
                if (isFound.isPresent()) {
                    .........
        }}

        } catch (Exception e){
            e.printStackTrace();
        }

        // Move here file into new subdirectory when file processing is finished
        // In my opinion, here is the error:
        // Path copied = Paths.get(file.getAbsolutePath() + "/processed");
        Path originalPath = file.toPath();
        try {
            // Note the use of the path we defined before
            Files.move(originalPath, processedDirectoryPath.resolve(originalPath.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}

You need to pass the required parameter to your jobs:

JobDetail job1 = ...;
job1.getJobDataMap().put("FILENAME_PROCESSING_CRITERIA", "job1pattern");

An even simpler approach, based on the same idea, could be splitting the files in different folders and pass the folder name that need to be processed as a job parameter:

@Override
public void execute(JobExecutionContext context) {

    // We obtain the directory path as a job parameter
    JobDataMap data = context.getJobDetail().getJobDataMap();
    String directoryPathName = data.getString("DIRECTORY_PATH_NAME");

    File directoryPath = new File(directoryPathName);
    // Create a new subfolder called "processed" into source directory
    // Hold a reference to the processed files directory path, we will
    // use it later
    Path processedDirectoryPath;
    try {
        processedDirectoryPath = Path.of(directoryPath.getAbsolutePath() + "/processed");
        if (!Files.exists(processedDirectoryPath) || !Files.isDirectory(processedDirectoryPath)) {
            Files.createDirectory(processedDirectoryPath);
        }
    } catch (IOException e) {
        throw new RuntimeException(e);
    }

    FilenameFilter textFileFilter = (dir, name) -> {
        String lowercaseName = name.toLowerCase();
        if (lowercaseName.endsWith(".csv")) {
            return true;
        } else {
            return false;
        }
    };
    // List of all the csv files
    File filesList[] = directoryPath.listFiles(textFileFilter);
    System.out.println("List of the text files in the specified directory:");
    for(File file : filesList) {

        try {
            try (var br = new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16)){
                List<CsvLine> beans = new CsvToBeanBuilder(br)
                        ......
                        .build()
                        .parse();

            for (CsvLine item : beans) {

                .....
                if (isFound.isPresent()) {
                    .........
        }}

        } catch (Exception e){
            e.printStackTrace();
        }

        // Move here file into new subdirectory when file processing is finished
        // In my opinion, here is the error:
        // Path copied = Paths.get(file.getAbsolutePath() + "/processed");
        Path originalPath = file.toPath();
        try {
            // Note the use of the path we defined before
            Files.move(originalPath, processedDirectoryPath.resolve(originalPath.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}

And pass a different folder to every different job:

JobDetail job1 = ...;
job1.getJobDataMap().put("DIRECTORY_PATH_NAME", "C:\\csv\\nov");
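With the SchedulerConfig shown in the question, the job data can also be wired in per job detail, for example via JobDetailFactoryBean.setJobDataAsMap. A sketch (the bean name and directory are made up):

```java
// Hypothetical additional bean in SchedulerConfig: a second job detail with its
// own directory, so two parallel jobs never scan the same folder.
@Bean
public JobDetailFactoryBean novemberJobDetail() {
    JobDetailFactoryBean factoryBean = new JobDetailFactoryBean();
    factoryBean.setJobClass(ImportCsvFilePostJob.class);
    factoryBean.setDurability(true);
    factoryBean.setJobDataAsMap(Map.of("DIRECTORY_PATH_NAME", "C:\\csv\\nov"));
    return factoryBean;
}
```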

Please consider refactoring your code and defining methods for file processing, file backup, etc.; it will make your code easier to understand and maintain.

jccampanero
  • I have one more issue. I want to use 5 Quartz jobs. But I get again `The process cannot access the file because it is being used by another process` because I have a race condition when Quartz jobs try to access the same file into the same directory. Do you know how this can be solved? – Peter Penzov Nov 20 '22 at 16:33
  • What is the purpose of launching several jobs, to increase parallelism? Is there any criteria you can use to split the filenames in batches, say certain pattern in the name, a month name, for instance? If that is the case, a simple solution would be to use the provided `JobExecutionContext` of every job to include some split criteria. That criteria will be used in your `FilenameFilter` causing every job to process only a certain portion of the whole amount of files that need to be processed. I think the solution is preferable to any kind of locking or similar mechanism. – jccampanero Nov 20 '22 at 17:27
  • I want to import more files. I can change the names. Can you show me this solution please? – Peter Penzov Nov 20 '22 at 17:40
  • I updated the answer @PeterPenzov, I hope it helps. – jccampanero Nov 20 '22 at 18:57
  • in progress. I will update you tomorrow. – Peter Penzov Nov 21 '22 at 20:33

Note this line in the error message:

Caused by: java.lang.RuntimeException: java.nio.file.FileSystemException: C:\csv\nov\07_06_26.csv -> C:\csv\nov\07_06_26.csv\processed: The process cannot access the file because it is being used by another process

I think you want to move the file from C:\csv\nov to C:\csv\nov\processed, so you have to change the following line:

Path copied = Paths.get(file.getAbsolutePath() + "/processed");

to

 Path copied = Paths.get(file.getParent() + "/processed/" + file.getName());

because file.getAbsolutePath() returns the complete path, including the name of the file.

Reporter

Assuming we have File file = new File("c:/test.txt"), printing the following paths:

Path copied = Paths.get(file.getAbsolutePath() + "/processed");
Path originalPath = file.toPath();

We will get the result:

copied: C:\test.txt\processed
originalPath: C:\test.txt

So it is incorrect. You should build the parent path plus the processed folder plus the file name:

Path copied = Paths.get(file.getParentFile().getAbsolutePath() + "/processed/" + file.getName());
Path originalPath = file.toPath();
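The same target path can also be built with NIO alone; a sketch (the helper class and method names are made up):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class MoveToProcessed {
    // Builds <parent>/processed/<name> from <parent>/<name> and moves the file there.
    static Path moveToProcessed(File file) throws IOException {
        Path source = file.toPath();
        Path target = source.getParent().resolve("processed").resolve(source.getFileName());
        return Files.move(source, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```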
Melron

I’m pretty sure that the file is being locked by the file reader that you create but never close in the following line:

List<CsvLine> beans = new CsvToBeanBuilder(new FileReader(file.getAbsolutePath(), StandardCharsets.UTF_16))

Refactor your code so that the reader is in a try-with-resources (or try/finally) block, or close it explicitly.

The unintuitive behavior you might see is that those files get released at seemingly random times. That is because the files are only released when the garbage collector finalizes those readers. Close them explicitly instead.
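A try/finally version, if you prefer closing explicitly over try-with-resources (a sketch; the parsing step is elided):

```java
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ExplicitClose {
    static void process(File file) throws IOException {
        Reader reader = new FileReader(file, StandardCharsets.UTF_16);
        try {
            // ... feed the reader to CsvToBeanBuilder and parse ...
        } finally {
            reader.close(); // without this, the handle is held until the GC runs
        }
    }
}
```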

Luke Machowski