Azure Synapse Analytics - column load with high length

Question

I have a column which contains description and has length around 80000(contains unicode characters). I am unable to load it to Azure Synapse Analytics. Datatype defined in Synapse is nvarchar. We are using Azure Databricks. I tried options with maxStrLength (.option('maxStrLength', 4000)), but the max value for this is 4000. So, I am getting this error.

: com.databricks.spark.sqldw.SqlDWSideException: Azure Synapse Analytics failed to execute the JDBC query produced by the connector. Underlying SQLException(s):

com.microsoft.sqlserver.jdbc.SQLServerException: HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopSqlException: String or binary data would be truncated. [ErrorCode = 107090] [SQLState = S0001]

Could you please let me know if there any way to solve this issue?

Update - The issue is resolved by removing .option('maxStrLength', 4000) while writing the dataframe into Synapse and declaring the target datatype as nvarchar(max)

What code are you running when you get this error? Please post an anonymised version. Having experienced similar errors when using external tables in Synapse dedicated SQL pools, the workaround was to create a table with NVARCHAR(MAX) datatype and _not_ used clustered columnstore index. — wBob, Jan 04 '22 at 11:18

score -1 · Accepted Answer · answered Jan 06 '22 at 07:47

You can store up to 1 billion 2-byte Unicode characters by using nvarchar [ ( n | max ) ].

Variable-length Unicode string data. n defines the string length and can be a value from 1 through 4,000. max indicates that the maximum storage size is 2^31-1 bytes (2 GB). The storage size, in bytes, is two times the actual length of data entered + 2 bytes. The ISO synonyms for nvarchar are national char varying and national character varying.

You can refer the related SO threads mentioned below:

How does SQL Server store more than 4000 characters in NVARCHAR(max)?

SQL query variable nvarchar(max) can not store more than 4000 characters

Azure Synapse Analytics - column load with high length

1 Answers1