My Linux server has a directory containing many sub-directories, each of which contains files named with keywords. For example:

Dir1:
 -Dir1.1:
   -file-keyword11-keyword7-keyword9.txt
   -file-keyword2-keyword7-keyword97.txt
 -Dir1.2:
   -file-keyword4-keyword6-keyword9.txt
   -file-keyword2-keyword8-keyword3.txt
Dir2:
 -Dir2.1:
   -file-keyword5-keyword42-keyword2.txt
   -file-keyword8-keyword11-keyword9.txt

I need to create a method that returns a list of all files whose names contain one of two keywords. For example:

findFiles("keyword11", "keyword42");

It should return the following files from the previous example:

-file-keyword11-keyword7-keyword9.txt
-file-keyword5-keyword42-keyword2.txt
-file-keyword8-keyword11-keyword9.txt

I am thinking of writing a recursive method that tests whether each file name contains one of the two keywords. But I am worried about performance, because the directories hold thousands of files and sub-directories, and more files will be created every day.

I would like to know the right way to do this. Should I use file.getName().contains()? Should I use a regex? Or should I use a Linux command such as grep?

viniciussss
  • Are you able to exec Linux commands in your code? If so, how about using "find"? – Top.Deck Jul 14 '17 at 18:50
  • @Top.Deck Yes, I can exec Linux commands, but I am not used to doing it. I will read up on the find command. – viniciussss Jul 14 '17 at 18:56
  • find . -type f \( -name "*key1*" -o -name "*key2*" \) – Top.Deck Jul 14 '17 at 18:58
  • @viniciussss If you do end up using `find`, try the command `find * -regextype 'posix-extended' -regex ".*keyword1.*|.*keyword2.*"`. You may have to construct the pattern as a string beforehand though. – kirkpatt Jul 14 '17 at 18:58
  • How fast does the search need to be? Finding files in Java is already pretty fast. You can use recursion to collect all the files, and you can call .stream() on lists too, so filtering is quick. – Garamaru Jul 14 '17 at 19:09
  • @Garamaru The users will have to wait for the process after signing up. The screen will freeze until the search is done, so the faster the better. Less than one second would be great. To be honest, I haven't tested anything yet, because the directories do not have many files now. I want to do it the right way now, to make sure I will not have problems once there are a lot of files in the directories. – viniciussss Jul 14 '17 at 19:47
  • So what is stopping you from creating tons of files to test your expected scan? Write a loop that creates as many files as you need with FileWriter (see https://stackoverflow.com/questions/30073980/java-writing-strings-to-a-csv-file). Then you can run your own tests to see which suggestion works well enough for you. – Garamaru Jul 15 '17 at 14:06
  • @Garamaru I will do it. Thanks for the suggestion. – viniciussss Jul 15 '17 at 14:18
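
The test suggested in the comments above could be sketched roughly as follows. This is only an illustration under assumptions: the class name ScanTimingSketch, the helper names createTestTree and countMatches, and the file counts are all made up here, and Files.createFile is used instead of FileWriter because empty files are enough when only names are scanned.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper class, not part of the original question or answer.
class ScanTimingSketch {

    /** Creates dirCount sub-directories under root, each holding filesPerDir empty files. */
    static void createTestTree(Path root, int dirCount, int filesPerDir) throws IOException {
        for (int d = 0; d < dirCount; d++) {
            Path dir = Files.createDirectories(root.resolve("Dir" + d));
            for (int f = 0; f < filesPerDir; f++) {
                // Names mimic the question's pattern: file-keywordA-keywordB-keywordC.txt
                String name = "file-keyword" + d + "-keyword" + f + "-keyword" + (d + f) + ".txt";
                Files.createFile(dir.resolve(name));
            }
        }
    }

    /** Counts regular files under root whose name contains either keyword. */
    static long countMatches(Path root, String first, String second) throws IOException {
        // The stream holds open directory handles, so close it with try-with-resources.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .map(p -> p.getFileName().toString())
                        .filter(n -> n.contains(first) || n.contains(second))
                        .count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("scan-test");
        createTestTree(root, 100, 100); // 10,000 files; raise to match the expected volume
        long start = System.nanoTime();
        long hits = countMatches(root, "keyword11", "keyword42");
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(hits + " matches in " + millis + " ms");
    }
}
```

A single timed run like this is only a rough indication; the file system cache and JVM warm-up can change the numbers a lot between runs.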

1 Answer

You can use FileVisitor; it's very convenient.

Here is an example:

import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.FileVisitor;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;

import org.junit.Test;

public class FileVisitorTest {

    @Test
    public void test() throws Exception {
        String path = "D:\\downloads\\";
        findFiles(path, "apache", "Test");
    }

    // Walks the tree under path and prints every file whose name contains a keyword.
    public void findFiles(String path, String... keyWords) {
        try {
            Files.walkFileTree(Paths.get(path), new FileVisitor<Path>() {
                @Override
                public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                    Path fileName = file.getFileName();
                    for (String keyWord : keyWords) {
                        if (fileName != null && fileName.toString().contains(keyWord)) {
                            System.out.println("File name:" + fileName);
                            break; // report each file once, even if several keywords match
                        }
                    }

                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult visitFileFailed(Path file, IOException exc) throws IOException {
                    // Skip files that cannot be read instead of aborting the walk.
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
                    return FileVisitResult.CONTINUE;
                }
            });
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

If you want to do something with directories, use the preVisitDirectory and postVisitDirectory methods, which run before and after a directory's entries are visited.
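
Since the question asks for a method that returns a list rather than printing, here is a minimal sketch of one way to do that with Files.find, which walks the same tree but hands back a stream. The class name KeywordFileFinder and the helper nameContainsAny are names invented for this illustration, and sorting the result is just a choice to make the output stable.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class name; a sketch, not the only way to do this.
class KeywordFileFinder {

    /** Returns every regular file under root whose name contains at least one keyword. */
    static List<Path> findFiles(Path root, String... keywords) throws IOException {
        // Files.find walks the tree lazily; close the stream to release directory handles.
        try (Stream<Path> matches = Files.find(root, Integer.MAX_VALUE,
                (file, attrs) -> attrs.isRegularFile() && nameContainsAny(file, keywords))) {
            return matches.sorted().collect(Collectors.toList());
        }
    }

    /** True if the last path element contains any of the keywords as a plain substring. */
    private static boolean nameContainsAny(Path file, String... keywords) {
        Path name = file.getFileName();
        if (name == null) {
            return false;
        }
        String text = name.toString();
        return Arrays.stream(keywords).anyMatch(text::contains);
    }
}
```

It would be called as KeywordFileFinder.findFiles(Paths.get("/some/root"), "keyword11", "keyword42"), where the root path is a placeholder. Two things to keep in mind: contains() is a substring test, so searching for "keyword9" would also match a file named with "keyword97", and unlike the visitor above, which skips failures in visitFileFailed, the stream version throws an UncheckedIOException if a directory cannot be read during the walk.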

DontPanic