Every weekend I add a few files to a google bucket
and then run something from the command line to "update" a table with the new data.
By "update" I mean that I delete the table and then remake it by using all the files in the bucket
, including the new files.
I do everything by using python
to execute the following command in the Windows
command line:
bq mk --table --project_id=hippo_fence-5412 mouse_data.partition_test gs://mybucket/mouse_data/* measurement_date:TIMESTAMP,symbol:STRING,height:FLOAT,weight:FLOAT,age:FLOAT,response_time:FLOAT
This table is getting massive (>200 GB) and it would be much cheaper for the lab to use partitioned tables.
I've tried a to partition the table in a few ways, including what is recommened by the official docs but I can't make it work.
The most recent command I tried was just inserting --time_partitioning_type=DAY
like:
bq mk --table --project_id=hippo_fence-5412 --time_partitioning_type=DAY mouse_data.partition_test gs://mybucket/mouse_data/* measurement_date:TIMESTAMP,symbol:STRING,height:FLOAT,weight:FLOAT,age:FLOAT,response_time:FLOAT
but that didn't work, giving me the error:
FATAL Flags parsing error: Unknown command line flag 'time_partitioning_type'
How can I make this work?