Can anyone please suggest how to create a partitioned table in BigQuery? Example: suppose I have log data in Google Storage for the year 2016, stored in one bucket and partitioned by year, month, and date. I want to create a table partitioned by date. Thanks in advance.
3 Answers
There are two options:
Option 1
You can load each daily file into a separate table named YourLogs_YYYYMMDD
See details on how to Load Data from Cloud Storage
After the tables are created, you can access them using either Table wildcard functions (Legacy SQL) or a Wildcard Table (Standard SQL). See also Querying Multiple Tables Using a Wildcard Table for more examples
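For instance, a wildcard query over the daily tables might look like the following sketch (the dataset and table names are placeholders, not from the original post):

```shell
# Hypothetical sketch: query all daily tables named YourLogs_YYYYMMDD at once
# with a Standard SQL wildcard table, filtering to January 2016 by suffix.
bq query --use_legacy_sql=false '
SELECT *
FROM `mydataset.YourLogs_*`
WHERE _TABLE_SUFFIX BETWEEN "20160101" AND "20160131"'
```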
Option 2
You can create a Date-Partitioned Table (just one table - YourLogs) - but you will still need to load each daily file into its respective partition - see Creating and Updating Date-Partitioned Tables
After the table is loaded you can easily Query Date-Partitioned Tables
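As a sketch, creating the partitioned table and loading one day's file into its partition could look like this (bucket path, schema, and dataset name are placeholder assumptions):

```shell
# Hypothetical sketch: create one day-partitioned table, then load a daily
# file from GCS into its partition via the $YYYYMMDD partition decorator.
bq mk --time_partitioning_type=DAY mydataset.YourLogs

bq load --source_format=CSV \
  'mydataset.YourLogs$20160501' \
  'gs://log_bucket/2016/05/01/*' \
  'ts:TIMESTAMP,message:STRING'
```

You would repeat the `bq load` step once per daily file, changing the decorator and the GCS path together.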

- Thanks Mikhail, can you please provide an example of this? – Soma Sekhar Kuruva Sep 01 '16 at 06:51
- You can find plenty of examples in the links. Try them, and then ask specific questions - otherwise it is too broad! – Mikhail Berlyant Sep 01 '16 at 12:42
- I have one year of history data in a single bucket in GCS, stored day-wise, and I want to create a table partitioned by date from the BigQuery web UI. I can create a normal table, but querying it processes all the data, and I only want to process a particular date range. How can I update the partitioned table? Is the base table the same as the partitioned table, or should I create a new table with partitions? How can I update the partitioned table with date-range partitions? Please help me with this. Thanks in advance. – Soma Sekhar Kuruva Sep 01 '16 at 13:39
- Suppose the base table, named log_base, is an external table pointing to Google Storage. I want to convert this log_base table into a partitioned table. What is the best approach for this? – Soma Sekhar Kuruva Sep 01 '16 at 13:42
Documentation for partitioned tables is here: https://cloud.google.com/bigquery/docs/creating-partitioned-tables
In this case, you'd create a partitioned table and populate its partitions with the data. You can run a query job that reads from GCS (and filters the data for a specific date) and writes to the corresponding partition of the table. For example, to load data for May 1st, 2016, you'd specify the destination_table as table$20160501.
Currently, you'll have to run several query jobs to achieve this process. Please note that you'll be charged for each query job based on bytes processed.
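One such query job might be sketched as follows (the dataset, table, and timestamp field names are placeholder assumptions):

```shell
# Hypothetical sketch: populate the May 1st, 2016 partition of table T by
# filtering the raw data on date and writing the result to T$20160501.
bq query --use_legacy_sql=false \
  --destination_table 'mydataset.T$20160501' \
  "SELECT * FROM mydataset.raw_logs WHERE DATE(ts) = '2016-05-01'"
```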
Please see this post for some more details:

- 1
- 1

- 2,285
- 1
- 12
- 12
- Thanks for the reply. When I worked with other Big Data tools, there was an option to create an external table partitioned by date ("CREATE TABLE foo (abc INT) PARTITIONED BY (year INT, month STRING, date INT)"), then "ALTER TABLE foo ADD PARTITION (2016, 'May', '01')", and then query the table as "SELECT * FROM foo WHERE year=2016 AND month='May' AND date=01". So with one table we can create multiple partitions. I was searching for an option like that in BigQuery. Could you please give me a sample table creation? Thanks a lot – Soma Sekhar Kuruva Sep 01 '16 at 05:51
- You'll have to create the table separately from the job that populates the partitions. For creating the table, please see here: https://cloud.google.com/bigquery/docs/creating-partitioned-tables#creating_a_partitioned_table – Pavan Edara Sep 01 '16 at 05:53
- Does that mean if I have 1 year of history data then I should create 365 tables? – Soma Sekhar Kuruva Sep 01 '16 at 06:28
- The suggestion is to create 365 partitions in a single partitioned table (as opposed to 365 tables). You can use a partition as a destination table. For example, if the partitioned table is T, the name T$20160501 corresponds to the partition for May 1st, 2016 in that table. – Pavan Edara Sep 01 '16 at 06:30
- How can we avoid paying for a separate query per day? When will this functionality be available? – Pentium10 Sep 01 '16 at 06:30
- I have one year of history data already placed in Google Storage, separated by date. How many tables should I create for that? Please give me the schema for one table. – Soma Sekhar Kuruva Sep 01 '16 at 06:30
- Answering the question from Pentium10 -- we are actively working on removing the need for running multiple queries. I am unable to provide an ETA, but we understand this is an important feature and are trying to support it as soon as we can. – Pavan Edara Sep 01 '16 at 06:33
- Here I have one year of history data, all in one bucket, partitioned by date inside it. While creating an external table pointing to Google Storage I have to give a location, so I give the whole bucket (gs://log_bucket/*). I created the table with a schema and location. Now I want to alter it with partitions. With "T$20160501" as you suggested, how will it find the exact date location for 20160501, i.e. gs://log_bucket/2016/05/01/*? Should we provide that manually? – Soma Sekhar Kuruva Sep 01 '16 at 06:50
- You'll need to specify which data goes into a specific partition. For example, if you have a timestamp field (ts) in your data, you might run a query job (assuming your external table is external_table): SELECT * FROM external_table WHERE DATE(ts) = '2016-05-01' with destination_table set to T$20160501. You can then do this for all the dates you want. There are some ways to reduce the cost of this operation (discussed in http://stackoverflow.com/questions/38993877/migrating-from-non-partitioned-to-partitioned-tables/39001706#39001706) – Pavan Edara Sep 01 '16 at 06:59
Partitions on an external table are not supported at the moment. There is a feature request for it:
https://issuetracker.google.com/issues/62993684
(please vote for it if you're interested in it!)
Google says that they are considering it.