
I'm processing files and directories, looking for the most recently modified file in each directory. The code I have works but, being new to Ruby, I'm having trouble handling errors correctly.

I use Find.find to get a recursive directory listing, calling my own function newestFile for each directory:

Find.find(ARGV[0]) { |f| 
  if File.directory?(f)
    newestFile(f)
  end
}

In the directory tree there are folders I do not have permission to access, so I want to ignore them and go on to the next one, but I cannot see how to incorporate the exception handling into the Find.find "loop".

I tried to put begin..rescue..end around the block but that does not allow me to continue processing the loop.

I also found this SO question: How to continue processing a block in Ruby after an exception? but that handles the error inside the loop. I'm trying to recover from errors occurring in Find.find itself, which is outside my begin..rescue block.

Here's the stack trace of the error:

PS D:\dev\ruby\> ruby .\findrecent.rb "M:/main/*"
C:/Ruby200/lib/ruby/2.0.0/find.rb:51:in `open': Invalid argument - M:/main/<A FOLDER I CAN'T ACCESS> (Errno::EINVAL)
        from C:/Ruby200/lib/ruby/2.0.0/find.rb:51:in `entries'
        from C:/Ruby200/lib/ruby/2.0.0/find.rb:51:in `block in find'
        from C:/Ruby200/lib/ruby/2.0.0/find.rb:42:in `catch'
        from C:/Ruby200/lib/ruby/2.0.0/find.rb:42:in `find'
        from ./findrecent.rb:17:in `<main>'

How do I add exception handling to this code?

I had a look in the code where the exception is being generated and the method contains the following block:

if s.directory? then
  begin
    fs = Dir.entries(file)
  rescue Errno::ENOENT, Errno::EACCES, Errno::ENOTDIR, Errno::ELOOP, Errno::ENAMETOOLONG
    next
  end
  ... more code

Performing a horrible hack I added Errno::EINVAL to the list of rescue errors. My code now executes and goes through all the folders but I can't leave that change in the Ruby library code.

Internally find is using Dir.entries, so maybe I need to rewrite my code to process the folders myself, and not rely on find.

I would still like to know whether there is a way of handling errors in this sort of construct, since from reading other code this concise block style is used a lot in Ruby.

Tony

2 Answers


Do you get this error in your newestFile function or when you call File.directory?

If this happens in newestFile you can do something like this:

Find.find(ARGV[0]) do |f| 
  if File.directory?(f)
    newestFile(f) rescue nil
  end
end

This just ignores any errors and moves on to the next folder. You could also print some nicer output if desired:

Find.find(ARGV[0]) do |f| 
  if File.directory?(f)
    begin
      newestFile(f)
    rescue
      puts "error accessing: #{f}, you might not have permissions"
    end
  end
end

If the error happens in File.directory? you need to wrap that call as well:

Find.find(ARGV[0]) do |f|
  begin
    if File.directory?(f)
      newestFile(f)
    end
  rescue
    puts "error accessing: #{f}, you might not have permissions"
  end
end

As you mentioned, if the error is occurring in Find.find itself then you can't catch it from inside the block. The rescue would have to happen inside that method.

Can you confirm that the exception is happening in that method and not the subsequent ones by pasting a stack trace of the exception?

Edit

I was going to suggest traversing the directories yourself with something like Dir.entries, so you would have the opportunity to catch the errors at that point. One thing I am interested in is what happens if you leave off the * in the call from the command line. I am on MacOS so I can't duplicate 100% what you are seeing, but if I allow it to traverse a directory that I don't have access to on my Mac, it prints debug info about which folders I can't access and continues on. If I give it the *, on the other hand, it seems to do nothing except print the error for the first folder it can't access.

One difference in my experience on the MacOS is that it isn't actually throwing the exception, it is just printing that debug info to the console. But it was interesting that the inclusion of the * made mine stop completely if I didn't have access to a folder.
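
As a rough sketch of traversing the tree yourself: the helper name each_accessible_dir below is hypothetical (it is not part of Ruby's standard library), and the choice to skip symlinks is my own assumption. It rescues SystemCallError, the parent class of the Errno::* exceptions, around each Dir.entries call, so an inaccessible folder is recorded and skipped instead of aborting the walk:

```ruby
# Hypothetical helper, not part of the standard library.
# Walks the tree depth-first, yields every directory whose entries can be
# listed, and returns an array of [path, exception class] pairs for the
# directories that had to be skipped.
def each_accessible_dir(root, skipped = [], &block)
  begin
    entries = Dir.entries(root) - %w[. ..]
  rescue SystemCallError => e # covers Errno::EACCES, Errno::EINVAL, etc.
    skipped << [root, e.class]
    return skipped
  end

  yield root

  entries.each do |name|
    path = File.join(root, name)
    # Assumption: symlinks are not followed, to avoid cycles.
    next if File.symlink?(path) || !File.directory?(path)
    each_accessible_dir(path, skipped, &block)
  end
  skipped
end

# Possible usage with the question's own method:
#   skipped = each_accessible_dir(ARGV[0]) { |dir| newestFile(dir) }
#   skipped.each { |path, error| warn "skipped #{path} (#{error})" }
```

Only the directory listing is wrapped in the rescue, so errors raised by the block itself still surface normally.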

Ben
  • I've updated my question with the stack trace. Unfortunately the error is originating in the `Find.find` code. – Tony Nov 07 '14 at 16:33
  • Thanks for your suggestions. Your comment "It would have to happen inside of that method." made me think of investigating further, so I had a look in the `find` code and I've added more to my question. – Tony Nov 07 '14 at 16:54
  • Ruby's `Find` is specially written for just the purpose the OP is using it for, to quickly traverse a directory hierarchy and selectively process files or directories based on criteria. `Dir.entries` is more generic and *can* be used in similar ways, but will result in more code. – the Tin Man Nov 07 '14 at 18:29

You can be reactive or proactive, either works, but by testing in advance, your code will run a little faster since you won't be triggering the exception mechanism.

Instead of waiting for a problem to happen then trying to handle the exception, you can find out whether you actually should try to change to a directory or access a file using the File class's owned? and grpowned? methods. From the File documentation:

grpowned?(file_name) → true or false

Returns true if the named file exists and the effective group id of the calling process is the owner of the file. Returns false on Windows.

owned?(file_name) → true or false

Returns true if the named file exists and the effective user id of the calling process is the owner of the file.

That means your code can look like:

Find.find(ARGV[0]) do |f| 
  if File.directory?(f) && %w[grpowned? owned?].any? { |m| File.send(m, f) }
    newestFile(f)
  end
end

Ruby will check to see if the directory entry is a directory and whether it is owned or grpowned by the current process. Because && is short-circuiting, if it's not a directory the second set of tests won't be triggered.
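
The same test can also be written without send. This is only a sketch: the method name owned_dir? is made up for illustration, and keep in mind that ownership is a proxy for access rather than a guarantee of it.

```ruby
# Hypothetical helper: true only when the path is a directory that is
# owned by the current process's effective group or effective user.
def owned_dir?(path)
  # && and || short-circuit, so the ownership checks run only for
  # directories, and File.owned? runs only when File.grpowned? is false.
  File.directory?(path) && (File.grpowned?(path) || File.owned?(path))
end

# Possible usage with the question's own method:
#   Find.find(ARGV[0]) { |f| newestFile(f) if owned_dir?(f) }
```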

On some systems the group permissions will give you a better chance of having access rights if there are lots of shared resources, so that gets tested first, and if it returns true, any? will return true and the code will progress. If false is returned because the group permissions don't allow access, then owned? will test the file and the code will either skip it or step into newestFile. Reverse those two tests for speed depending on the set-up of your system. Or, run the code once using time ruby /path/to/your/code, then swap the two and run it again. Compare the resulting times to see which is faster on your system.

There are different schools of thought about whether using exception handling to control program flow is good and different languages prefer different things in their programming styles. To me, it seems like code will always run faster and more safely if I know in advance whether I can do something, rather than try and have it blow up. If it blows up in an expected way, that's one thing, but if it blows up in ways I didn't expect, then I might not have exception handling in place to react correctly, or it might trigger other exceptions that mask the true cause. I'd rather see if I can work my way out of a situation by checking the state, and then if all my attempts failed, have an exception handler that lets me gracefully exit. YMMV.

Finally, in Ruby, we don't name methods using camelCase, we use snake_case. snake_case_is_easier toReadThanCamelCase.

the Tin Man
  • I like this approach of checking ownership before doing something with it. – Bala Nov 07 '14 at 17:49
  • It's "kinder and gentler" I think. – the Tin Man Nov 07 '14 at 17:50
  • @theTinMan - Thanks for your answer; I also prefer to avoid exceptions as you describe but in this instance the exception is occurring _in_ the `Find.find` method, so I'm not able to check folder access in the block before it goes bang. There might be a Windows specific bug in the Ruby library as it recovers from `Errno::EACCES` [Permission denied] but I'm getting the exception Errno::EINVAL [Invalid argument]. I might try using the enumerator returned from `find` (and not pass a block) or use `Dir.entries` and, as you said in another comment, live with the code bloat. – Tony Nov 10 '14 at 16:40
  • Windows doesn't support many of the *nix file-level stat checks. Those are noted in the documentation. – the Tin Man Nov 10 '14 at 18:04