
I'm trying to read a small text file that was added as a table to the default database on Databricks. When I try to read the file via the Local File API, I get a FileNotFoundError, but I can read the same file as a Spark RDD using SparkContext.

Please find the code below:

with open("dbfs:/FileStore/tables/boringwords.txt", "r") as f_read:
  for line in f_read:
    print(line)

This gives me the error:

FileNotFoundError                         Traceback (most recent call last)
<command-2618449717515592> in <module>
----> 1 with open("dbfs:/FileStore/tables/boringwords.txt", "r") as f_read:
      2   for line in f_read:
      3     print(line)

FileNotFoundError: [Errno 2] No such file or directory: 'dbfs:/FileStore/tables/boringwords.txt'

Whereas I have no problem reading the file using SparkContext:

boring_words = sc.textFile("/FileStore/tables/boringwords.txt")
set(i.strip() for i in boring_words.collect())

And as expected, I get the result for the above block of code:

Out[4]: {'mad',
 'mobile',
 'filename',
 'circle',
 'cookies',
 'immigration',
 'anticipated',
 'editorials',
 'review'}

I also consulted the DBFS documentation to understand the Local File API's limitations, but found no lead on the issue. Any help would be greatly appreciated. Thanks!

Riyaz Ali

3 Answers


The problem is that you're using Python's built-in open function, which works only with local files and knows nothing about DBFS or other file systems. To get this working, use the DBFS local file API and prepend the /dbfs prefix to the file path, i.e. /dbfs/FileStore/...:

with open("/dbfs/FileStore/tables/boringwords.txt", "r") as f_read:
  for line in f_read:
    print(line)
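
If the same path has to work both for Spark (dbfs:/... or /FileStore/...) and for open(), the conversion can be wrapped in a small helper. This is only a minimal sketch: to_local_path and read_words are hypothetical names, not part of any Databricks API, and it assumes DBFS is mounted at /dbfs on the driver as described above.

```python
DBFS_SCHEME = "dbfs:"
DBFS_MOUNT = "/dbfs"  # assumption: DBFS is exposed to local file APIs under /dbfs


def to_local_path(path: str) -> str:
    """Map a DBFS-style path to the path seen by local file APIs (hypothetical helper)."""
    if path.startswith(DBFS_MOUNT + "/"):
        return path  # already in local form, leave it alone
    if path.startswith(DBFS_SCHEME):
        path = path[len(DBFS_SCHEME):]  # drop the "dbfs:" scheme
    return DBFS_MOUNT + "/" + path.lstrip("/")


def read_words(local_path: str) -> set:
    """Read one word per line into a set, skipping blank lines."""
    with open(local_path, "r") as f_read:
        return {line.strip() for line in f_read if line.strip()}


# Intended use in a Databricks notebook (not executed here):
# boring_words = read_words(to_local_path("dbfs:/FileStore/tables/boringwords.txt"))
```

The helper only rewrites the string; it does not check that the file exists, so open() will still raise FileNotFoundError if the path is wrong.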
Alex Ott

Alternatively, you can simply use Spark's built-in csv reader, which understands dbfs: paths and returns a DataFrame:

df = spark.read.csv("dbfs:/FileStore/tables/boringwords.txt")
Luiz Viola

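Alternatively, we can use dbutils to list the files in the directory and confirm the file's DBFS path:

# List everything under /FileStore/tables/ and print each entry's full path
files = dbutils.fs.ls('/FileStore/tables/')
for fi in files:
  print(fi.path)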
