I'm trying to do something fairly simple, but either odo is broken or
I don't understand how datashapes work in the context of this package.
The CSV file:
email,dob
tony@gmail.com,1982-07-13
blah@haha.com,1997-01-01
...
The code:
from odo import odo
import pandas as pd
df = pd.read_csv("...")
connection_str = "postgresql+psycopg2:// ... "
t = odo('path/to/data.csv', connection_str, dshape='var * {email: string, dob: datetime}')
The error:
AssertionError: datashape must be Record type, got 0 * {email: string, dob: datetime}
It's the same error if I try to go directly from a DataFrame -> Postgres as well:
t = odo(df, connection_str, dshape='var * {email: string, dob: datetime}')
Two other things I tried that don't fix the problem: 1) removing the header line from the CSV file, and
2) changing var in the dshape to the actual number of rows in the DataFrame.
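To rule out the data itself, here is a minimal standard-library sketch that checks whether the sample rows fit the record I declared (email as a string, dob as a date). The `parse_rows` helper and the inline `SAMPLE` text are only for illustration; the real file has more rows than the two shown above.

```python
import csv
import io
from datetime import datetime

# Inline copy of the two sample rows shown above (the real file is longer).
SAMPLE = """email,dob
tony@gmail.com,1982-07-13
blah@haha.com,1997-01-01
"""


def parse_rows(text):
    """Parse CSV text into (email, dob) tuples.

    Raises ValueError if a dob value is not in YYYY-MM-DD form.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        dob = datetime.strptime(record["dob"], "%Y-%m-%d")
        rows.append((record["email"], dob))
    return rows


rows = parse_rows(SAMPLE)
for email, dob in rows:
    print(email, dob.date())
```

Both sample rows parse without a ValueError, so the column contents do not look like the source of the AssertionError.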
What am I doing wrong here?
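One thing I have not ruled out: as far as I can tell from the odo docs, SQL targets are written as `uri::tablename`, so a connection string without that suffix may be read as a whole database rather than a single table. I don't know whether that explains the error. Below is a sketch of building a table-qualified target under that assumption; the `with_table` helper, the table name `users`, and the placeholder URI are all hypothetical and not part of odo.

```python
def with_table(uri: str, table: str) -> str:
    """Return uri with a '::table' suffix, leaving an already-qualified URI alone."""
    if "::" in uri:
        return uri
    return f"{uri}::{table}"


# Placeholder URI for illustration only; it is not my real connection string.
target = with_table("postgresql+psycopg2://user:pass@localhost/mydb", "users")
print(target)
```

The resulting string would then be passed as the second argument to odo in place of the bare connection string.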