
I'm searching a directory of files with Java 8 and extracting music files. When I run my code on Linux (Debian Wheezy) it completes in around 20 seconds. However, when I run the identical code in Windows 8.1 (same machine!) it takes an inordinately long time, so long that it's really unusable. I've ascertained that the process is occurring as it should, just very slowly. In the time that the Linux variant finds all 2500 files, the Windows variant has found around 100.

Here is the code:

public int List(String path) throws InterruptedException, IOException {
    //Linux Variant
    if (HomeScreen.os.equals("Linux")) {
        File root = new File(path);
        File[] list = root.listFiles();
        // listFiles() returns null for unreadable paths, so check before sorting
        if (list == null) {
            return 0;
        }
        Arrays.sort(list);

        for (File f : list) {
            if (f.isDirectory()) {
                List(f.getAbsolutePath());

            } else if (f.isFile()) {
                String outPath = f.getAbsolutePath();
                try {
                    String ext = outPath.substring(outPath.lastIndexOf(".") + 1);
                    if (ext.equals("wma") || ext.equals("m4a") || ext.equals("mp3")) {
                        String fulltrack = outPath.substring(outPath.lastIndexOf("Music/") + 6);
                        lm.addElement(fulltrack);
                        numbers++;
                    }
                } catch (Exception e) {
                    System.out.println(outPath + " is not a valid file!!!!!");
                }
                HomeScreen.Library.setModel(lm);

            }

        }
    //Windows variant
    } else if (HomeScreen.os.equals("Windows 8.1")){
        System.out.println("Using " + HomeScreen.os + " methods...");
        File root = new File(path);
        File[] list = root.listFiles();
        // listFiles() returns null for unreadable paths, so check before sorting
        if (list == null) {
            return 0;
        }
        Arrays.sort(list);

        for (File f : list) {
            if (f.isDirectory()) {
                List(f.getAbsolutePath());

            } else if (f.isFile()) {
                String outPath = f.getAbsolutePath();
                try {
                    String ext = outPath.substring(outPath.lastIndexOf(".") + 1);
                    if (ext.equals("wma") || ext.equals("m4a") || ext.equals("mp3")) {
                        String fulltrack = outPath.substring(outPath.lastIndexOf("Music/") + 9);
                        lm.addElement(fulltrack);
                        numbers++;
                    }
                } catch (Exception e) {
                    System.out.println(outPath + " is not a valid file!!!!!");
                }
                HomeScreen.Library.setModel(lm);

            }

        }
    }
    return numbers;
}

I'm still pretty new to Java, so I'm not sure how to go about optimising the code for Windows. Is there any way this can be sped up, or are Windows users doomed to go for a coffee and wait for the load up?

Incidentally, I've put this method in a thread when using Windows so that other things can be done whilst waiting, but this is most definitely not an ideal solution. The drive being searched is a 7200 rpm HDD and there is 8GB RAM installed.

Luiggi Mendoza
Guy Stimpson
  • check the solution here http://stackoverflow.com/a/19520486/2273540 – Lorenzo Boccaccia Sep 11 '14 at 22:17
  • Thanks for the pointer. Some of those answers are quite jargon heavy, but I'll get my thinking cap on in the morning and have a go. Thanks :) – Guy Stimpson Sep 11 '14 at 22:48
  • Consider using [`Files.walkFileTree`](http://docs.oracle.com/javase/8/docs/api/java/nio/file/Files.html#walkFileTree-java.nio.file.Path-java.nio.file.FileVisitor-) rather than a homebrew solution. – Boris the Spider Sep 11 '14 at 23:03

2 Answers


Try the new Java 8 Stream API. It lets you do all of the operations (sort, filter, forEach) in one loop, and in parallel.

Here is your code, changed accordingly (you might need to fix the parts I didn't have, such as HomeScreen):

    Arrays.asList(root.listFiles())
        .parallelStream()
        .filter(file -> file != null)
        .forEach(file -> {
            if (file.isDirectory())
            {
                List(file.getAbsolutePath());
            }
            else if (file.isFile())
            {
                String outPath = file.getAbsolutePath();
                try
                {
                    String ext = outPath.substring(outPath.lastIndexOf(".") + 1);
                    if (ext.equals("wma") || ext.equals("m4a") || ext.equals("mp3"))
                    {
                        String fulltrack = outPath.substring(outPath.lastIndexOf("Music/") + 9);
                        lm.addElement(fulltrack);
                        numbers++;
                    }
                } catch (Exception e)
                {
                    System.out.println(outPath + " is not a valid file!!!!!");
                }
                HomeScreen.Library.setModel(lm);
            }
        });
David Limkys
  • Thanks David. I'm away from my screen now but will have a try tomorrow. Many thanks! – Guy Stimpson Sep 11 '14 at 22:49
  • File IO is not improved by parallelisation. In fact it can be slowed by it. Further, this is an abuse of the `Stream` API - you should use [`Files.walkFileTree`](http://docs.oracle.com/javase/8/docs/api/java/nio/file/Files.html#walkFileTree-java.nio.file.Path-java.nio.file.FileVisitor-). – Boris the Spider Sep 11 '14 at 23:02
  • @BoristheSpider I ran some benchmarks showing up to 3 times speedup in directory scanning with <=4 threads (and no slowdown with 10 threads). It may backfire, but the disks can serve multiple requests efficiently (and I mean off-the-shelf SATA, no high-end SCSI). – maaartinus Sep 11 '14 at 23:24
  • This isn't coming out any faster. Also, I now am unable to sort the array, although I imagine there is a swift workaround for this. – Guy Stimpson Sep 12 '14 at 01:09
  • @GuyStimpson no, there is not. If you do things in parallel then you cannot guarantee order. – Boris the Spider Sep 12 '14 at 07:01
  • @DavidLimkys I would assume that your benchmarks are flawed - benchmarking is a very tricky thing, especially IO as there are many caches that get filled with recently accessed data. – Boris the Spider Sep 12 '14 at 07:02
  • Would someone be kind enough to explain to me exactly how to implement `Files.walkFileTree`? I've read a number of posts / tutorials but am failing to get it working. I do have a workaround in mind but it's less than perfect (along the lines of saving the output of an initial file search to a text file, and then adding a 'scan for changes' button). – Guy Stimpson Sep 12 '14 at 08:04
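
For the `Files.walkFileTree` approach requested in the comments above, a minimal sketch could look like the following. The class name `MusicWalker` and the helpers `isMusicFile` and `findMusic` are made up for illustration and are not from the question. The sketch also assumes that extensions should match case-insensitively, which the original code does not do. Whether this is any faster on a network drive is untested.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class MusicWalker {

    // Hypothetical helper: true for the three extensions used in the question.
    // Assumption: matching is case-insensitive (the original code is case-sensitive).
    static boolean isMusicFile(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        return name.endsWith(".mp3") || name.endsWith(".m4a") || name.endsWith(".wma");
    }

    // Walks everything below musicRoot and returns the music files found,
    // as sorted paths relative to musicRoot.
    static List<String> findMusic(Path musicRoot) throws IOException {
        List<String> tracks = new ArrayList<>();
        Files.walkFileTree(musicRoot, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                // The attributes are handed over by the walk, so no extra isFile() call is needed.
                if (attrs.isRegularFile() && isMusicFile(file)) {
                    tracks.add(musicRoot.relativize(file).toString());
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Report unreadable entries and keep going instead of aborting the walk.
                System.out.println(file + " could not be read: " + exc);
                return FileVisitResult.CONTINUE;
            }
        });
        Collections.sort(tracks);
        return tracks;
    }
}
```

The returned list could then be added to the list model in one go after the walk has finished, for example by calling `MusicWalker.findMusic(Paths.get(path))` from the background thread.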

As recommended in a comment to the question linked by Lorenzo Boccaccia, I'd go for `Files.newDirectoryStream`. It returns the entries lazily, one by one, instead of building the whole array up front as `listFiles` does, which should be faster.
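
A minimal sketch of that idea, assuming a recursive scan like the one in the question. The names `DirectoryStreamScan`, `hasMusicExtension` and `findMusic` are hypothetical, and no speed-up is guaranteed:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class DirectoryStreamScan {

    // Hypothetical helper: same three extensions as in the question (case-sensitive, as there).
    static boolean hasMusicExtension(Path file) {
        String name = file.getFileName().toString();
        return name.endsWith(".mp3") || name.endsWith(".m4a") || name.endsWith(".wma");
    }

    // Recursively collects music files below dir into out.
    static void collect(Path dir, List<Path> out) throws IOException {
        // try-with-resources closes the directory handle once the loop is done
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isDirectory(entry)) {
                    collect(entry, out);
                } else if (hasMusicExtension(entry)) {
                    out.add(entry);
                }
            }
        }
    }

    static List<Path> findMusic(Path root) throws IOException {
        List<Path> result = new ArrayList<>();
        collect(root, result);
        return result;
    }
}
```

Unlike the original loop, this does not sort the entries. The result list can be sorted once at the end if order matters.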

I'd also consider using multithreading. With a single thread, you wait for the disk nearly all the time. Modern disks are capable of handling multiple outstanding requests, so using 2-4 threads should help.
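
One possible way to try this is a fork/join task per directory, sketched below. The class `ParallelMusicScan` and its methods are invented for this example, and the thread count is a parameter you would have to tune. Fork/join pools are designed mainly for CPU-bound work, so treat this as an experiment under those assumptions rather than a proven fix.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ParallelMusicScan extends RecursiveTask<List<Path>> {
    private final Path dir;

    ParallelMusicScan(Path dir) {
        this.dir = dir;
    }

    // Hypothetical helper: same three extensions as in the question.
    static boolean hasMusicExtension(Path file) {
        String name = file.getFileName().toString();
        return name.endsWith(".mp3") || name.endsWith(".m4a") || name.endsWith(".wma");
    }

    @Override
    protected List<Path> compute() {
        List<Path> found = new ArrayList<>();
        List<ParallelMusicScan> subTasks = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isDirectory(entry)) {
                    // Each subdirectory becomes its own task that another thread may pick up.
                    ParallelMusicScan task = new ParallelMusicScan(entry);
                    task.fork();
                    subTasks.add(task);
                } else if (hasMusicExtension(entry)) {
                    found.add(entry);
                }
            }
        } catch (IOException e) {
            System.out.println(dir + " could not be read: " + e);
        }
        // Merge the results of the subdirectory tasks into this directory's list.
        for (ParallelMusicScan task : subTasks) {
            found.addAll(task.join());
        }
        return found;
    }

    // Scans root with the given number of threads and returns a sorted list.
    static List<Path> scan(Path root, int threads) {
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            List<Path> result = pool.invoke(new ParallelMusicScan(root));
            Collections.sort(result); // order is restored after the parallel part
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

Collecting into per-task lists and sorting once at the end sidesteps the ordering problem of a parallel scan, and no shared counter or list model is touched from several threads.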


A side note: There's no reason to write the code differently for Linux and Windows. There may be minor changes needed, but they should be handled by some small helper method.

Never write things like

if (HomeScreen.os.equals("Linux")) {
    ...
} else if (HomeScreen.os.equals("Windows 8.1")) {
    ...
}

What if it's "Windows 8.2"?
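
The two branches in the question differ only in the offset passed to `substring` (`+ 6` versus `+ 9`). A small helper along the following lines could replace both. `TrackNames` and `relativeTrack` are made-up names, and the sketch assumes the music root directory is known as a `Path` and that tracks should be shown with `/` separators on every OS:

```java
import java.io.File;
import java.nio.file.Path;

class TrackNames {

    // Returns the path of file relative to musicRoot, e.g. "Artist/song.mp3".
    // relativize() does the work that the hard-coded substring offsets did,
    // without depending on the OS name or on where "Music/" occurs in the path.
    static String relativeTrack(Path musicRoot, Path file) {
        String relative = musicRoot.relativize(file).toString();
        // Normalise the platform separator ('\\' on Windows) to '/'.
        return relative.replace(File.separatorChar, '/');
    }
}
```

With this, the same loop can run unchanged on Linux and on any Windows version.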

Community
maaartinus
  • Thanks very much for the answer. I'll have a try tomorrow. Yeah, I agree about the `if (HomeScreen.os.equals("Windows 8.1"))` part, just trying to get my head around the basic operations. In all honesty, this part of the code WAS working without any changes for Windoze. Just needed to put something in to make sure the issue was OS specific :) – Guy Stimpson Sep 11 '14 at 22:44
  • @GuyStimpson Sometimes it looks like a good idea to duplicate the code, but I always try hard to avoid it. Otherwise changes get harder and harder and merging two similar pieces of code together is a damn hard task. Use [version control](http://git-scm.com), commit often and don't be scared to rewrite or drop what doesn't look right. – maaartinus Sep 11 '14 at 23:30
  • I have taken your advice and removed the duplication :) Still nowhere with the speed-up though :/ – Guy Stimpson Sep 12 '14 at 01:31
  • @GuyStimpson That's sad. Windows is crap, but it shouldn't be that bad. There's `Files.walkFileTree` in Java 7; it's newer and it's easy to use. It could be faster or not. The slow part is always the disk access, but maybe something has been improved in how it interfaces with the OS. Multithreading could help or not, and it's harder to do... – maaartinus Sep 12 '14 at 01:55
  • I didn't mention, which I probably should have, that the drive being read is a network drive which appears to have a Linux-like OS (at least it does when I ssh to it). I fear that either this, or maybe the file system in use, is what's giving Windows a headache. – Guy Stimpson Sep 12 '14 at 02:01