
I'm trying to load data from Azure Databricks into an Azure SQL Database table via JDBC. The data loads fine, but the double values from Azure Databricks are converted to exponential (scientific) notation when inserted into the SQL table. I have tried different data types in the SQL database, such as nvarchar, varchar, and float, and the values still end up in exponential notation.

However, when I try the decimal data type in Azure SQL Database, the data loads into the column without exponential notation, but with extra zeros at the end.

The command I'm using in Databricks is:

%scala

spark.sql("select ID from customers")
     .write
     .mode(SaveMode.Append) // <--- Append to the existing table
     .jdbc(jdbcUrl, "stg.customers", connectionProperties) // table name passed as a String

Some example values stored in the Azure Databricks ID column:

|ID|
|:--|
|1900845009567889.12|
|2134012183812321|

When using the numeric(38,15) data type in Azure SQL Database, I get the following output:

|ID|
|:--|
|1900845009567889.1200000000000000|
|2134012183812321.0000000000000000|

I don't want the extra zeros at the end. Also, the data in the Databricks table is not precisely defined, so I cannot say whether numeric(38,15) would suffice.

I also tried storing the Azure Databricks ID column as a String and loading it into a varchar or nvarchar column in the SQL table, but the data is still converted to exponential notation.
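For reference, the string attempt looked roughly like this (a minimal sketch; the table and column names are the same ones from my write above). I suspect the plain cast itself is what produces the scientific notation for large doubles:

%scala

import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.functions.col

// Cast the double column to string before the JDBC write. For large
// values Spark renders a double as e.g. "1.9008450095678892E15", so the
// varchar/nvarchar column still receives the exponential form.
spark.table("customers")
     .withColumn("ID", col("ID").cast("string"))
     .write
     .mode(SaveMode.Append)
     .jdbc(jdbcUrl, "stg.customers", connectionProperties)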

Can anyone suggest an easy way to load this data from Azure Databricks into an Azure SQL database?

John Rotenstein
Kshitij

1 Answer


I cannot say whether numeric(38,15) would suffice

Before SQL Server 2016, the conversion of numeric values was restricted to a precision of 17 digits. From SQL Server 2016 (13.x) onward, that restriction no longer applies.

Generic Example:

Below is a simplified example from the Microsoft documentation that shows how decimal and numeric values are inserted into their respective columns.

CREATE TABLE dbo.MyTable
(
    DecimalColumn DECIMAL(5, 2),
    NumericColumn NUMERIC(10, 5)
);
GO

INSERT INTO dbo.MyTable VALUES (123, 12345.12);
GO

SELECT DecimalColumn, NumericColumn FROM dbo.MyTable;

Result for the above SQL Query:

|DecimalColumn|NumericColumn|
|:--|:--|
|123.00|12345.12000|

I don't want the extra zeros at the end.

In SQL Server, the float data type can be used to exclude the extra zeros at the end (i.e., the scale).

To do this, cast the value to float, for display purposes only:

SELECT CAST(12345.1200000 as float)

Output:

12345.12

Reference for the approach above to exclude the extra zeros.
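If the goal is to avoid casting on the SQL side altogether, another option is to render the double as a plain decimal string on the Databricks side before the JDBC write. This is a sketch, not part of the float approach above; it assumes the customers/stg.customers names from the question and uses java.math.BigDecimal:

%scala

import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.functions.{col, udf}

// Format a double as a plain decimal string, stripping trailing zeros,
// so the target varchar column receives e.g. "1900845009567889.2" rather
// than "1.9008450095678892E15". A double only carries ~15-17 significant
// digits, so any precision beyond that is already lost upstream.
val toPlainString = udf { (d: java.lang.Double) =>
  if (d == null) null
  else java.math.BigDecimal.valueOf(d).stripTrailingZeros.toPlainString
}

spark.table("customers")
     .withColumn("ID", toPlainString(col("ID")))
     .write
     .mode(SaveMode.Append)
     .jdbc(jdbcUrl, "stg.customers", connectionProperties)

BigDecimal.valueOf goes through Double.toString, so it emits the shortest decimal representation of the stored double rather than padding or inventing digits.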

RajkumarPalnati
  • Thanks for your response. I was hoping not to use CAST and to use the data in my SQL table directly, in the format stored in the Azure Databricks tables. Was wondering if you have faced something similar in the past. Even if I store the values as a "String" in Databricks and use varchar/nvarchar as the SQL column data type, the data is still converted to exponential format in SQL. Wanted to understand where I can prevent that from happening. – Kshitij Jun 22 '22 at 04:01