27

Can someone let me know how to use the Databricks dbutils to delete all files from a folder? I have tried the following, but unfortunately Databricks doesn't support wildcards.

dbutils.fs.rm('adl://azurelake.azuredatalakestore.net/landing/stageone/*')

Thanks

Climbs_lika_Spyder
Carltonp

6 Answers

36

According to the documentation, the rm function takes two parameters:

rm(dir: String, recurse: boolean = false): boolean -> Removes a file or directory

The second parameter is a boolean flag that enables recursive deletion, so you just need to set it to true:

dbutils.fs.rm('adl://azurelake.azuredatalakestore.net/landing/stageone/',True)
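
Note that a recursive rm on the folder path removes the folder itself along with its contents. If the folder has to exist afterwards, one option is to recreate it with dbutils.fs.mkdirs. A minimal sketch, assuming `fs` is `dbutils.fs`; `reset_directory` is a hypothetical helper name, not part of dbutils. The recreated folder is a new object, so any permissions set on the original may not carry over.

```python
def reset_directory(fs, path):
    """Remove `path` recursively, then create it again as an empty folder.

    `fs` is assumed to be `dbutils.fs` (hypothetical parameter name).
    Returns whatever `fs.mkdirs` returns.
    """
    fs.rm(path, True)        # deletes the folder and everything below it
    return fs.mkdirs(path)   # recreates the now-empty folder
```

In a notebook this would be called as `reset_directory(dbutils.fs, 'adl://azurelake.azuredatalakestore.net/landing/stageone/')`.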
jegordon
  • 5
    This deletes the directory as well. How to delete all files *without* deleting the directory? – Samuel Mar 26 '19 at 10:54
  • 1
    @jegordon your answer is wrong; the question is how to remove the files, not the folder itself. – Paul Velthuis Jun 05 '19 at 06:21
  • 1
    Wrong answer: this removes the folder as well, not only the contents of the folder (in which case any special folder permissions are also deleted). – gszecsenyi Mar 17 '20 at 17:04
  • dbutils.fs.rm("adl://azurelake.azuredatalakestore.net/landing/stageone/", **true**) – jack Oct 05 '21 at 16:02
5

Something like this should work:

val PATH = "adl://azurelake.azuredatalakestore.net/landing/stageone/"
dbutils.fs.ls(PATH)
            .map(_.name)
            .foreach((file: String) => dbutils.fs.rm(PATH + file, true))
celezar
2

To sum up all the above answers:

  1. To delete the whole folder, use:

    PATH = "adl://azurelake.azuredatalakestore.net/landing/stageone/"
    dbutils.fs.rm(PATH, True)

  2. To delete all the files and subfolders but keep the folder itself, use:

    PATH = "adl://azurelake.azuredatalakestore.net/landing/stageone/"
    for i in dbutils.fs.ls(PATH):
        dbutils.fs.rm(i.path, True)
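
The loop in the second option can be wrapped in a small helper. This is a minimal sketch, not an official API: `clear_directory` is a hypothetical name, and the `fs` parameter is assumed to be `dbutils.fs` (passing it in keeps the function usable outside a notebook cell, for example with a stub). It relies only on `ls` and `rm`.

```python
def clear_directory(fs, path):
    """Delete everything inside `path` but keep `path` itself.

    `fs` is assumed to be `dbutils.fs` (or any object with the same
    `ls` and `rm` methods). Returns the list of paths that were removed.
    """
    removed = []
    for entry in fs.ls(path):
        # recurse=True so that non-empty subfolders are removed as well
        fs.rm(entry.path, True)
        removed.append(entry.path)
    return removed
```

In a notebook this would be called as `clear_directory(dbutils.fs, PATH)`.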

I tested this myself, and it works.

Let me know if you have any concerns.

--Shizheng

shizzhan
1

To extend ezraorich's answer: if you want to delete the folders and files inside a directory, use:

PATH = "dbfs/azure/directory/sub_dirctory"

for i in dbutils.fs.ls(PATH):
    dbutils.fs.rm(i[0],True)
Davide Briscese
1

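Python users could do something like this to delete the files directly under the folder (pass True as the second argument to rm if subfolders should be removed too):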

folder_path = 'adl://azurelake.azuredatalakestore.net/landing/stageone/'
path_list = [fileinfo.path for fileinfo in dbutils.fs.ls(folder_path)]
for path in path_list:
    dbutils.fs.rm(path)
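
Because dbutils.fs.rm does not expand wildcards, a pattern can be matched on the client side instead. The following is a minimal sketch, assuming the standard-library `fnmatch` module for the matching and that each entry returned by `dbutils.fs.ls` exposes `name`, `path` and `isDir()`; `rm_matching` is a hypothetical helper name, and `fs` stands for `dbutils.fs`.

```python
from fnmatch import fnmatchcase


def rm_matching(fs, folder_path, pattern="*"):
    """Delete files directly under `folder_path` whose name matches `pattern`.

    Subfolders are skipped, so the folder structure is left in place.
    `fs` is assumed to be `dbutils.fs`. Returns the deleted paths.
    """
    deleted = []
    for info in fs.ls(folder_path):
        if info.isDir():
            continue  # only files are removed
        # fnmatchcase does a case-sensitive shell-style match (*, ?, [abc])
        if fnmatchcase(info.name, pattern):
            fs.rm(info.path)
            deleted.append(info.path)
    return deleted
```

For example, `rm_matching(dbutils.fs, folder_path, "*.csv")` would remove only the CSV files, while the default pattern `"*"` removes every file in the folder and leaves subfolders alone.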
0

This worked for me:

PATH = "adl://azurelake.azuredatalakestore.net/landing/stageone/"

for i in dbutils.fs.ls(PATH):
    dbutils.fs.rm(i[0])