I'm trying to write a Delta-formatted table to ADLS Gen2 from an Azure Synapse PySpark notebook with a Serverless SQL pool.

It throws the following error while writing to ADLS Gen2:

Py4JJavaError: An error occurred while calling o3898.save.
: Operation failed: "An HTTP header that's mandatory for this request is not specified.", 400, PUT,

Below is the code which I am using to write to ADLS.

from delta.tables import DeltaTable

if DeltaTable.isDeltaTable(spark, source_path):
    print('Existing delta table')
    # Read the existing Delta table
    delta_table = DeltaTable.forPath(spark, source_path)

    # Merge new data into the existing table
    delta_table.alias("existing").merge(
        source = df_eventLog.alias("updates"),
        condition = " AND ".join(conditions_list)
    ).whenMatchedUpdateAll(
    ).whenNotMatchedInsertAll(
    ).execute()
else:
    print('New delta table')
    # Create a new Delta table with the new data
    df_eventLog.write.format('delta').save(source_path)

In my case, the Delta table isn't available initially, so the else branch runs. df_eventLog loads fine without errors.

Can someone help me see where I am going wrong?

subro

1 Answer

I tried to replicate the same issue with a Delta table in my lake database. I loaded the Delta table into a data frame using the code below, in order to write it to an ADLS storage account:

df_delta = spark.read.format("delta").table("<database>.<tableName>")

When I wrote the data frame to ADLS using the code below, I got the same error:

delta_table_path = "abfss://<containerName>@<ADLSName>.blob.core.windows.net/"
df_delta.write.format("delta").mode("overwrite").save(delta_table_path)

As per this MS Document, the URL used to address an ADLS storage account should be in the abfs[s]://<file_system>@<account_name>.dfs.core.windows.net/<path>/<file_name> format. Accordingly, I modified delta_table_path as shown below:

delta_table_path = "abfss://<containerName>@<ADLSName>.dfs.core.windows.net/<filepath>"

I tried again to write the data with the above URL format, and it was written to the specified path without any error.
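
To catch this kind of mistake before calling save, a small path check can help. The following is only a sketch: the helper name to_dfs_uri and the account name myaccount are made up for illustration, and it assumes the only problem with the path is the blob.core.windows.net host.

```python
# Hypothetical helper (not part of Spark or Delta Lake): swap the Blob endpoint
# host for the DFS endpoint host in an abfs[s]:// path.
BLOB_SUFFIX = ".blob.core.windows.net"
DFS_SUFFIX = ".dfs.core.windows.net"


def to_dfs_uri(path: str) -> str:
    """Return `path` with a blob.core.windows.net host replaced by dfs.core.windows.net."""
    scheme, sep, rest = path.partition("://")
    if not sep or scheme not in ("abfs", "abfss"):
        raise ValueError(f"expected an abfs:// or abfss:// path, got: {path!r}")

    # The authority is everything up to the first "/": <container>@<account>.<endpoint>
    authority, slash, tail = rest.partition("/")
    if authority.endswith(BLOB_SUFFIX):
        authority = authority[: -len(BLOB_SUFFIX)] + DFS_SUFFIX

    return f"{scheme}://{authority}{slash}{tail}"


# "myaccount" is a placeholder storage account name.
fixed_path = to_dfs_uri(
    "abfss://framework@myaccount.blob.core.windows.net/controlschema/DeltaTables/EventLog/"
)
print(fixed_path)
```

The returned string can then be passed to save(...) in place of the original path. Paths that already use dfs.core.windows.net are returned unchanged.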

Bhavani
  • Thanks for the comment. But if you look at the error message, I am writing the Delta table to a path, i.e. framework/controlschema/DeltaTables/EventLog/, where framework is the container name. I still see the error! – subro Jul 25 '23 at 23:40
  • Thanks for the comment. I think I found the issue, and it is working now. But as I observed, you used blob.core.windows.net initially and later changed it to dfs.core.windows.net. Can you please let me know what the difference is between blob.core.windows.net and dfs.core.windows.net? – subro Jul 26 '23 at 00:04
  • blob.core.windows.net is used for Blob storage and dfs.core.windows.net is used for ADLS storage. You can check [this](https://learn.microsoft.com/en-us/answers/questions/124197/writing-parquet-file-throws-an-http-header-thats-m) for more information. – Bhavani Jul 26 '23 at 03:47