I'm trying to use Spark to pull data from a Hive table and save it into a SQL Server table. The issue I'm running into is that some columns are pulled into the DataFrame as the BYTE datatype; I would like them to be pulled as TINYINT instead, or as INT if TINYINT is not possible.
The basic way I am doing it is this:
val query = "[SQL query]"        // the Hive-side SELECT
val df = spark.sql(query)        // spark is the SparkSession, with Hive support enabled
df.write.jdbc([connection info]) // JDBC URL, target table name, and connection properties
How can I apply a schema to this process that forces certain data types?
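
For reference, one workaround I'm considering is casting the ByteType columns after the read and before the write. This is only a rough sketch: the query string, the column name status, and url, table, and props are made-up placeholders, not my real values, and I'm assuming Spark 2.2+ for the createTableColumnTypes option.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{ByteType, IntegerType}

// Placeholder connection details, not my real values
val url   = "jdbc:sqlserver://..."
val table = "dbo.target_table"
val props = new java.util.Properties()

val spark = SparkSession.builder()
  .enableHiveSupport() // required so spark.sql() can see Hive tables
  .getOrCreate()

val query = "SELECT * FROM some_hive_table" // stand-in for my real query
val df = spark.sql(query)

// Recast every ByteType column to IntegerType so the JDBC writer never
// emits a BYTE column type that SQL Server cannot parse
val recast = df.schema.fields
  .filter(_.dataType == ByteType)
  .foldLeft(df) { (acc, field) =>
    acc.withColumn(field.name, col(field.name).cast(IntegerType))
  }

recast.write
  // Spark 2.2+: override the DDL Spark generates for specific columns
  // ("status" is a made-up column name here)
  .option("createTableColumnTypes", "status TINYINT")
  .jdbc(url, table, props)

From what I understand, createTableColumnTypes only affects the DDL Spark generates when it creates the target table, so if the SQL Server table already exists with TINYINT columns the cast alone should be enough. Is there a cleaner way to force the schema itself, rather than casting column by column?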