I am trying to use DBUtils and Pyspark from a jupyter notebook python script (running on Docker) to access an Azure Data Lake Blob. However, I can't seem to get dbutils to be recognized (i.e. NameError: name 'dbutils' is not defined). I've tried explicitly importing DBUtils, as well as not importing it as I read:
"An important point to remember is to never run import dbutils in your Python script. This command succeeds but clobbers all the commands so nothing works. It is imported by default." Link
I've also tried the solution posted here, but it still threw "KeyError: 'dbutils'"
spark.conf.set('fs.azure.account.key.<storage account>.blob.core.windows.net', <storage account access key>)
spark.conf.set("fs.azure.createRemoteFileSystemDuringInitialization", "true")
dbutils.fs.ls("abfss://<container>@<storage account>.dfs.core.windows.net/")
spark.conf.set("fs.azure.createRemoteFileSystemDuringInitialization", "false")
Does anyone have a solution to this?