I have a "CSV" file that uses a multi-character delimiter, so the data looks something like
field1^|^,^|^field2^|^,^|^field3^|^,^|^field4
The following code in a Databricks notebook throws the error shown below on the second line, where it tries to write the dataframe df to a destination table:
df = spark.read.csv(".../Test/MyFile.csv", sep="^|^,^|^", header="true", inferSchema="true")
df.write.(.....)
Error:
java.sql.SQLException: Spark Dataframe and SQL Server table have differing numbers of columns
Both the CSV files and the database tables have exactly the same column names and the same number of columns. The error seems to indicate that the delimiter ^|^,^|^ is not being recognized. Is there a way to parse this data file with the ^|^,^|^ delimiter?
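One quick way to confirm whether the delimiter is actually being applied is to inspect the parsed dataframe before the write. A minimal diagnostic sketch, assuming the df produced by the read above:

df.printSchema()          # should list four separate columns, not one
print(len(df.columns))    # compare against the target table's column count
df.show(5, truncate=False)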
UPDATE: My bad, I originally forgot to include the sep argument in the spark.read.csv(...) call. I have now added it, with the value ^|^,^|^, as shown above, but the error is still the same.
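For what it's worth, Spark's CSV reader accepted only a single-character sep before Spark 3.0, so on an older runtime a multi-character delimiter would not be honored. A minimal fallback sketch under that assumption (run in a Databricks notebook where spark is predefined; the column names c0 through c3 are hypothetical) is to read the file as plain text and split each line on the regex-escaped delimiter:

from pyspark.sql.functions import split, col

raw = spark.read.text(".../Test/MyFile.csv")      # same path as above
parts = split(col("value"), r"\^\|\^,\^\|\^")     # delimiter escaped as a Java regex
df = raw.select(*[parts.getItem(i).alias(f"c{i}") for i in range(4)])
# Note: with this approach the first row still holds the header names and
# would need to be filtered out or used to rename the columns.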