
I'm currently working on a search output system that searches a directory for a specific phrase in a file, matches it, then outputs it to a log file. I have a problem snippet of code that looks like this:

int j = 0;
for (String currentMatch : lineMatch) {
    String[] split = fileList.get(j).toString().split("\\\\");
    match.write(split[3] + " : " + currentMatch + "\r\n");
    match.flush();
    j++;
}

Here fileList is an ArrayList of the files with a matching result, and filePath is an ArrayList of the file paths. I used split[3] to get the name of the fourth folder in the directory that I'm interested in.

The output file then becomes a little funky. This directory in question has roughly 40 unique names, but the log ends up looking like this:

    dir1 :   matchingline
    dir2 :   matchingline
    dir3 :   matchingline
    dir3 :   matchingline
    ...     (x543)
    dir4 :   matchingline

And so on. Directory 3 is only supposed to have 88 matching lines and ends up with an additional 455 lines that belong to other directories. Any idea on why this happens? Is it because I'm using an assignment in the middle of a PrintWriter, or am I missing something simple here?

Edit: Variables listed for clarity.

match = PrintWriter object used to print to an output.

lineMatch = ArrayList() - contains the directory path of the current matched file

fileMatch = ArrayList() - contains the file name that was matched.

split[3] is used because the matched files are consistently found in the 4th directory in, ex. C:\User\Programs\Programname\

\r\n is used to keep formatting on Windows.

This is a personal project, so I'm not too concerned with making it portable.

Edited to add the method used for initializing the arraylist.

public static void addFiles(String dirPath) {
    File dir = new File(dirPath);
    File[] files = dir.listFiles();

    try {
        if (files.length == 0) {
            emptyFilePath.add(dirPath);
        } else {
            for (File currentFile : files) {
                if (currentFile.isFile()) {
                    fileList.add(currentFile);
                    filePath.add(currentFile.getPath());
                } else if (currentFile.isDirectory()) {
                    addFiles(currentFile.getAbsolutePath());
                }
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}

And the code that generates lineMatch:

while (i < fileList.size()) {
    File files = new File(filePath.get(i));
    Scanner file = new Scanner(files);
    try {
        while (file.hasNextLine()) {
            String currentLine = file.nextLine();
            if (currentLine.contains(searchString)) {
                lineMatch.add(currentLine);
            }
        }
    } finally {
        file.close();
    }
    i++;
}
  • The edit still doesn't clarify enough. How is `lineMatch` generated? How is `fileMatch` generated? It was clear that they would be lists. But how are these lists initialized? – Christian Hujer Jul 23 '18 at 18:45
  • Posted above, basically just a recursive method to search a specified directory and all sub directories – Jonathan Buelow Jul 23 '18 at 18:49
  • Answer was different here - It's because while lineMatch was updated and cut down the number of strings to just those that matched, filePath was not and had reference places to ArrayList elements that had been skipped. – Jonathan Buelow Jul 23 '18 at 19:01

1 Answer


There are a number of things that are suspicious about your code.

  • Are LineMatch and FileList variables? If so, then you should write them like variables, that is, lineMatch and fileList (lowerCamelCase). Doing otherwise confuses readers and syntax highlighters alike.
  • You use split[3], that looks suspicious.
  • If you are using split("\\\\") in order to get the directory path parts, beware that your code is non-portable: it will work on Windows only. If you want to split a path into its parts, it's better to use the path API (java.nio.file.Path).
  • In order to understand the problem, it would be useful to see how LineMatch and FileList are generated. Without that, it's not possible to understand what's going wrong in your code.
  • If match is a PrintWriter or PrintStream, you should use println() or format("...%n") instead of write(... + "\r\n"). Otherwise your code is, again, not portable: on Unix, line endings are \n only, not \r\n.
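As a minimal sketch of the two portability points above, the snippet below reads one directory name from a java.nio.file.Path and writes the line with println(). The class name PathParts, the helper nameAt and the sample path are hypothetical, made up for illustration. Note that name elements do not include the root, so on Windows the element you get from split[3] for C:\User\Programs\Programname\... would likely be getName(2) here.

```java
import java.io.PrintWriter;
import java.nio.file.Path;
import java.nio.file.Paths;

class PathParts {
    // Hypothetical helper: returns the name element at the given zero-based
    // index, or "" if the path has no element at that index.
    static String nameAt(Path path, int index) {
        if (index < 0 || index >= path.getNameCount()) {
            return "";
        }
        return path.getName(index).toString();
    }

    public static void main(String[] args) {
        // Sample path built from its parts, so no separator is hard-coded.
        Path file = Paths.get("User", "Programs", "Programname", "logs", "app.log");

        PrintWriter match = new PrintWriter(System.out, true);
        // println() appends the platform line separator instead of "\r\n".
        match.println(nameAt(file, 2) + " : " + "matchingline");
        // prints "Programname : matchingline"
    }
}
```

Returning an empty string for a short path is just one possible choice; throwing an exception would make a wrong assumption about the directory depth visible sooner.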

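The actual problem is with your program logic. Your variable lineMatch contains the hits of all files found, because you don't generate a separate lineMatch for each file but just a single one for all files. Your output loop then advances j once per matching line, while fileList has one entry per file, so the j-th hit is paired with the j-th file rather than with the file it came from. At least that's what it looks like from the code that you've posted so far.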
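One way to keep each hit tied to its file is to store them together instead of in two parallel lists. The following is only a sketch under assumptions: the class PairedMatches, the nested Hit type and the in-memory map of file names to lines are hypothetical stand-ins for your real files, chosen so that the example runs without touching the disk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PairedMatches {
    /** One matching line together with the file it was found in. */
    static final class Hit {
        final String file;
        final String line;

        Hit(String file, String line) {
            this.file = file;
            this.line = line;
        }
    }

    // Collects every line containing searchString, remembering its file.
    static List<Hit> findHits(Map<String, List<String>> linesByFile, String searchString) {
        List<Hit> hits = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : linesByFile.entrySet()) {
            for (String line : entry.getValue()) {
                if (line.contains(searchString)) {
                    hits.add(new Hit(entry.getKey(), line));
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for files on disk.
        Map<String, List<String>> linesByFile = new LinkedHashMap<>();
        linesByFile.put("dir1/a.txt", Arrays.asList("needle one", "nothing here"));
        linesByFile.put("dir2/b.txt", Arrays.asList("nothing here"));
        linesByFile.put("dir3/c.txt", Arrays.asList("needle two", "needle three"));

        // No separate index is needed: each hit carries its own file.
        for (Hit hit : findHits(linesByFile, "needle")) {
            System.out.println(hit.file + " : " + hit.line);
        }
    }
}
```

With this shape, a file without any match (dir2/b.txt in the sample) simply contributes no hits, and the remaining hits cannot drift onto the wrong file.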

It looks like what you want to program is a simple version of grep (or, on DOS, find). Part of your logic is correct, for example, how you use recursion to descend into the directory tree. Instead of collecting all matches and then printing, find and print the matches while you're traversing the directory tree.
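Printing while traversing could look roughly like the sketch below. It is not your program, just an illustration: the class name MiniGrep and the method grep are hypothetical, it uses Files.walk instead of hand-written recursion, and it assumes that every file is text in UTF-8 (Files.readAllLines throws an exception on bytes it cannot decode). Everything is passed in as a parameter, so no shared lists are involved.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MiniGrep {
    // Walks root recursively and prints "relative/path : line" for every
    // line containing searchString. Returns the number of matching lines.
    static int grep(Path root, String searchString, PrintWriter out) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        int count = 0;
        for (Path file : files) {
            // Assumption: the file is UTF-8 text and fits in memory.
            for (String line : Files.readAllLines(file)) {
                if (line.contains(searchString)) {
                    // The file is known at the moment the line matches.
                    out.println(root.relativize(file) + " : " + line);
                    count++;
                }
            }
        }
        out.flush();
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: MiniGrep <directory> <searchString>");
            return;
        }
        grep(Paths.get(args[0]), args[1], new PrintWriter(System.out, true));
    }
}
```

Because the file name and the matching line are written in the same step, there is no index that could get out of sync between two lists.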

In general, you will end up with fewer errors if you avoid global variables. You ran into a problem in the first place because your variables LineMatch and FileList are global variables. Avoid global variables, avoid reusing variables, and also avoid variable re-assignment.

Christian Hujer