How to save and download locally csv in DBFS?

Asked Oct 08 '19 at 14:10

Active Oct 23 '19 at 07:36

Viewed 3,132 times

I'm trying to save csv file as a result of SQL query, sent to Athena via Databricks. The file is supposed to be a big table of about 4-6 GB (~40m rows).

I'm doing the next steps:

Creating PySpark dataframe by:

df = sqlContext.sql("select * from my_table where year = 19")

Converting PySpark dataframe to Pandas dataframe. I realize, this step may be unnecessary, but I only start using Databricks and may not know the required commands to do it more swiftly. So I do it like this:
```
ab = df.toPandas()
```
Save the file somewhere to download it locally later:
```
ab.to_csv('my_my.csv')
```

But how do I download it?

I kindly ask you to be very specific as I do not know many tricks and details in working with Databricks.

edited Oct 08 '19 at 15:31

James Z

12,209
10
24
44

asked Oct 08 '19 at 14:10

Dmytro Zelenyi

1

Do you want to download the file to DBFS or to the local machine? – CHEEKATLAPRADEEP Oct 21 '19 at 09:34
Eventually, I would like having the file locally on my machine. – Dmytro Zelenyi Oct 22 '19 at 13:02

How to save and download locally csv in DBFS?

0 Answers0

Linked