I am using a DSN to connect my local Python installation to an HDFS cluster:
import pandas as pd
import pyodbc

with pyodbc.connect("DSN=CDH_HIVE_PROD", autocommit=True) as conn:
    df = pd.read_sql("""Select * from table1""", conn)
df
How do I write this table back to the cluster as 'table1tmp'? Do I need a CREATE statement to create the table first? And how do I then insert the data from a pandas DataFrame?
I assume this is done frequently enough that it should be fairly easy (pull data, do something, save data back), but I am not able to find any examples that use pyodbc or a DSN, which seems to be my only way of connecting.
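For reference, the create-then-insert flow I have in mind would look roughly like the sketch below. It is untested against a real cluster and rests on several assumptions: that the DSN points at Hive, that the Hive version accepts `INSERT INTO ... VALUES`, and that the ODBC driver accepts `?` parameter markers with `cursor.executemany`. The helper names (`hive_type`, `create_sql`, `insert_sql`, `write_rows`) and the dtype-to-Hive type mapping are hypothetical, made up for illustration.

```python
def hive_type(kind):
    """Map a NumPy dtype kind code to a Hive column type (a rough guess)."""
    mapping = {"i": "BIGINT", "u": "BIGINT", "f": "DOUBLE",
               "b": "BOOLEAN", "M": "TIMESTAMP"}
    return mapping.get(kind, "STRING")  # anything else is stored as STRING


def create_sql(table, columns):
    """Build a CREATE TABLE statement from (name, hive_type) pairs."""
    cols = ", ".join(f"`{name}` {ctype}" for name, ctype in columns)
    return f"CREATE TABLE IF NOT EXISTS {table} ({cols})"


def insert_sql(table, n_columns):
    """Build a parameterized INSERT with one '?' marker per column."""
    marks = ", ".join("?" * n_columns)
    return f"INSERT INTO {table} VALUES ({marks})"


def write_rows(conn, table, columns, rows, batch_size=1000):
    """Create the table if needed, then insert rows in batches.

    conn is an open DB-API connection (e.g. from pyodbc.connect);
    returns the number of rows sent to the server.
    """
    cur = conn.cursor()
    cur.execute(create_sql(table, columns))
    sql = insert_sql(table, len(columns))
    batch, total = [], 0
    for row in rows:
        batch.append(tuple(row))
        if len(batch) >= batch_size:
            cur.executemany(sql, batch)
            total += len(batch)
            batch = []
    if batch:  # send whatever is left over
        cur.executemany(sql, batch)
        total += len(batch)
    return total


# Hypothetical usage with the DataFrame from above:
# columns = [(name, hive_type(dtype.kind)) for name, dtype in df.dtypes.items()]
# rows = df.itertuples(index=False, name=None)
# with pyodbc.connect("DSN=CDH_HIVE_PROD", autocommit=True) as conn:
#     write_rows(conn, "table1tmp", columns, rows)
```

My concern with this approach is that row-by-row inserts into Hive may be very slow for anything but small tables, and that the column names returned by the query may carry a table prefix that would need cleaning before they are reused in the CREATE statement.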