
I created a cluster in Azure Databricks. On its DBFS (Databricks File System) I've mounted an Azure Blob Storage container, roughly along these lines (a sketch; the account, container, scope, and key names below are placeholders):
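
dbutils.fs.mount(
    source="wasbs://<container>@<account>.blob.core.windows.net",
    mount_point="/mnt/flights",
    extra_configs={"fs.azure.account.key.<account>.blob.core.windows.net": dbutils.secrets.get(scope="<scope>", key="<key>")}
)

In a notebook I read and transform the data (using PySpark), and after all of this I want to write the transformed dataset back to Azure Blob Storage. I do so with the following command: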

model_data.write.mode("overwrite").format("com.databricks.spark.csv").options(header="True", delimiter=",").csv("/mnt/flights/model_data.csv")

I also tried

model_data.coalesce(1).write.mode("overwrite").format("com.databricks.spark.csv").options(header="True", delimiter=",").save("/mnt/flights/model_data.csv")

but I couldn't get the result I wanted, which is to write the model_data DataFrame as a single model_data.csv file in the container I mounted previously.

The result is always the same:

[Screenshot: the mounted container as it looks in Azure Blob Storage]

Instead of a single model_data.csv file, a folder is created at that path containing a file with a pseudorandom name like "part-xxxxxxxxxx.csv".
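
From what I've read, Spark treats the path passed to .csv() or .save() as an output directory and writes one part file per partition, which would explain this. The sketch below is the workaround I'm considering (the temporary directory name and the dbutils rename step are my own guess, not an official recipe); I'd still prefer a direct way to end up with a single model_data.csv:

# Write a single partition to a temporary directory, so only one part file is produced
tmp_dir = "/mnt/flights/model_data_tmp"
model_data.coalesce(1).write.mode("overwrite").option("header", "true").csv(tmp_dir)

# Locate the lone part-*.csv file, move it to the name I actually want, and clean up
part_file = [f.path for f in dbutils.fs.ls(tmp_dir) if f.name.startswith("part-")][0]
dbutils.fs.mv(part_file, "/mnt/flights/model_data.csv")
dbutils.fs.rm(tmp_dir, True)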

Thanks!

FelipePerezR
  • This question might be a duplicate of https://stackoverflow.com/questions/31674530/write-single-csv-file-using-spark-csv. Could you please check the content of the part-xxxx file? – Hauke Mallow Nov 11 '18 at 16:46
  • Indeed, the part-xxxx.csv file contains the data I expected. Is that what you were asking me to check? – FelipePerezR Nov 11 '18 at 21:46
  • Yes, that is the way it works. Please check the link for further explanation. – Hauke Mallow Nov 11 '18 at 22:21
  • @FelipePerezR Could you share the code you used to write data to Azure Blob Storage from Azure Databricks? I've been getting a lot of "Job aborted" errors trying to do the same. I'd also like to see the code you used to mount the blob storage to Databricks, in case that's where the problem lies, because I've tried both write options from your question to no avail. – Hafiz Adewuyi Feb 08 '19 at 06:44
  • Please check if this works for you: https://stackoverflow.com/a/75579796/680074 – Tayyab Vohra Feb 27 '23 at 11:23

0 Answers