I'm using the Hadoop FileStatus API to determine whether a directory is empty so that it can be deleted.

To determine whether a directory on s3n is empty, I check the length of the FileStatus[] array for that path. If the length is zero, I ask the Hadoop FileSystem to delete the directory via fs.delete(path, false), where false requests a non-recursive delete.

For FTP and HDFS, the files and then the now-empty directories that contained them are deleted as expected. On s3n, however, the empty directories remain, and I'm not sure why.

I have local unit/integration tests that use an in-memory S3 filesystem, and there the delete works as expected. However, when the same code runs against a real s3n filesystem, it fails: the files are deleted, but the empty directories are not.

Any suggestions or pointers would be much appreciated. Thank you.

user983022
  • A couple of questions that might help others trying to help you: Do you get any exceptions or error messages? Do you have versioning enabled? – Viccari Jul 09 '12 at 10:58
  • No, no exceptions at all, no messages. What do you mean by Versioning? – user983022 Jul 09 '12 at 11:56
  • I meant: "Does your S3 bucket have versioning enabled?" – Viccari Jul 09 '12 at 12:21
  • No, as far as I know it does not. However, I've discovered that the recursive delete does work if the directories contain files: directories that become empty because their files were deleted are removed as expected. If I attempt to delete directories that were empty from the outset, they remain. – user983022 Jul 09 '12 at 19:53
  • Well, S3 is a "flat" file system, and does not have the concept of folders. It might be possible that the length is zero simply because there is no file with that name. What you are calling a "directory" is in fact simply a file prefix in S3. Have a look at [this answer](http://stackoverflow.com/questions/9329234/amazon-aws-ios-sdk-how-to-list-all-file-names-in-a-folder/9330600#9330600), as it might help clarifying. – Viccari Jul 09 '12 at 20:01
  • Thanks for your comments. They helped solve the problem. – user983022 Jul 12 '12 at 08:20
  • NP. I have added an answer, so if you think it helped, you can accept it, and others with similar problems might benefit from it. Thanks. – Viccari Jul 12 '12 at 11:29

1 Answer

Since you did not see any exceptions or error messages, and your bucket does not seem to have versioning enabled, here is the likely explanation:

S3 is a "flat" file system and has no concept of folders. What you are calling a "directory" is in fact just a key prefix in S3. The length may be zero simply because no object exists with that name, in which case there is nothing for the delete to remove. Have a look at [this answer](http://stackoverflow.com/questions/9329234/amazon-aws-ios-sdk-how-to-list-all-file-names-in-a-folder/9330600#9330600), as it might help clarify.
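To make the "prefix, not folder" point concrete, here is a minimal sketch in plain Java. It does not use Hadoop or the S3 API; `FlatKeyStoreDemo` and its methods are hypothetical names invented for illustration, and the sorted in-memory set is only an assumed stand-in for how a flat key namespace behaves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical in-memory model of a flat key store (not the real S3 or Hadoop API).
public class FlatKeyStoreDemo {
    // The "bucket" is nothing more than a sorted set of full key names.
    private final TreeSet<String> keys = new TreeSet<>();

    public void put(String key) {
        keys.add(key);
    }

    // Returns true only if an object with exactly this key existed.
    public boolean deleteKey(String key) {
        return keys.remove(key);
    }

    // "Listing a directory" is just a scan for keys that share a prefix.
    public List<String> listPrefix(String prefix) {
        List<String> matches = new ArrayList<>();
        for (String key : keys.tailSet(prefix, true)) {
            if (!key.startsWith(prefix)) {
                break; // sorted order: no later key can match
            }
            matches.add(key);
        }
        return matches;
    }

    public static void main(String[] args) {
        FlatKeyStoreDemo bucket = new FlatKeyStoreDemo();
        bucket.put("logs/2012/a.txt");
        bucket.put("logs/2012/b.txt");

        System.out.println(bucket.listPrefix("logs/2012/").size()); // 2
        System.out.println(bucket.listPrefix("logs/2013/").size()); // 0: never existed

        bucket.deleteKey("logs/2012/a.txt");
        bucket.deleteKey("logs/2012/b.txt");

        // An emptied prefix looks the same as one that never existed.
        System.out.println(bucket.listPrefix("logs/2012/").size()); // 0
        // There is no "logs/2012/" object, so there is nothing to delete.
        System.out.println(bucket.deleteKey("logs/2012/"));         // false
    }
}
```

In this model a zero-length listing cannot distinguish an empty folder from a path that does not exist, which is why a length check alone may not tell you that there is something to delete.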

Viccari