I have the following pandas dataframe:
import pandas as pd
df = pd.DataFrame({"id": [1, 2, 3], "items": [('a', 'b'), ('a', 'b', 'c'), tuple('d')]})
print(df)
   id      items
0   1     (a, b)
1   2  (a, b, c)
2   3       (d,)
After registering my GCP/BQ credentials in the normal way...
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path_to_my_creds.json"
... I try to export it to a BQ table:
import pandas_gbq
pandas_gbq.to_gbq(df, "my_table_name", if_exists="replace")
but I keep getting the following error:
Traceback (most recent call last):
File "<string>", line 4, in <module>
File "/Users/max.epstein/opt/anaconda3/envs/rec2env/lib/python3.7/site-packages/pandas_gbq/gbq.py", line 1205, in to_gbq
...
File "/Users/max.epstein/opt/anaconda3/envs/rec2env/lib/python3.7/site-packages/google/cloud/bigquery/_pandas_helpers.py", line 342, in bq_to_arrow_array
return pyarrow.Array.from_pandas(series, type=arrow_type)
File "pyarrow/array.pxi", line 915, in pyarrow.lib.Array.from_pandas
File "pyarrow/array.pxi", line 312, in pyarrow.lib.array
File "pyarrow/array.pxi", line 83, in pyarrow.lib._ndarray_to_array
File "pyarrow/error.pxi", line 122, in pyarrow.lib.check_status
pyarrow.lib.ArrowTypeError: Expected bytes, got a 'tuple' object
I have tried converting the tuple column to string with df = df.astype({"items": str}), and I have also tried adding a table_schema param to the pandas_gbq.to_gbq(...) line, but I keep getting this same error.
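Roughly, that attempt looked like the following (the exact table_schema I passed is from memory, so the types shown are approximate):

import pandas_gbq

# cast the tuple column to its string representation, e.g. "('a', 'b')"
df = df.astype({"items": str})

# declare the column types explicitly; "my_table_name" is a placeholder
pandas_gbq.to_gbq(
    df,
    "my_table_name",
    if_exists="replace",
    table_schema=[
        {"name": "id", "type": "INTEGER"},
        {"name": "items", "type": "STRING"},
    ],
)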
I have also tried replacing the pandas_gbq.to_gbq(...) line with the bq_client.load_table_from_dataframe method described here, but I still get the same pyarrow.lib.ArrowTypeError: Expected bytes, got a 'tuple' object error...
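For completeness, that attempt looked roughly like this (the project/dataset names are placeholders, and WRITE_TRUNCATE is my stand-in for if_exists="replace"):

from google.cloud import bigquery

bq_client = bigquery.Client()

# same dataframe, loaded via the BigQuery client library instead of pandas-gbq
job_config = bigquery.LoadJobConfig(
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)
job = bq_client.load_table_from_dataframe(
    df, "my_project.my_dataset.my_table_name", job_config=job_config
)
job.result()  # wait for the load job to finish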