113

A BigQuery table has a schema, which can be viewed in the web UI, updated, or used as a JSON file to load data with the bq tool. However, I can't find a way to dump this schema from an existing table to a JSON file (preferably from the command line). Is that possible?

Daniel Waechter

8 Answers

172

> a way to dump schema from an existing table to a JSON file (preferably from the command-line). Is that possible?

Try the following:

bq show bigquery-public-data:samples.wikipedia  

You can use the `--format` flag to prettify the output:

--format: none|json|prettyjson|csv|sparse|pretty:

Format for command output. Options include:

none:       ...
pretty:     formatted table output  
sparse:     simpler table output  
prettyjson: easy-to-read JSON format  
json:       maximally compact JSON  
csv:        csv format with header   

The first three are intended to be human-readable, and the latter three are for passing to another program. If no format is selected, one will be chosen based on the command run.

Realized I provided a partial answer :o)

Below does what the OP wanted:

bq show --format=prettyjson bigquery-public-data:samples.wikipedia | jq '.schema.fields' 
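For example, to list just the column names and types from that output (a minimal sketch, assuming jq is installed; same public sample table as above):

bq show --format=prettyjson bigquery-public-data:samples.wikipedia | jq -r '.schema.fields[] | "\(.name)\t\(.type)"'

The -r flag makes jq print raw tab-separated lines instead of JSON strings.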
Mikhail Berlyant
  • Thank you. I kept looking for other keywords like "export" and "dump", as well as the word "schema", and none of the docs for "show" had that. – Daniel Waechter Apr 03 '17 at 22:29
  • I would recommend exploring the bq command directly in the Google Cloud SDK Shell. Just start with bq --help and ... :o) – Mikhail Berlyant Apr 03 '17 at 22:30
  • For posterity, this command does what I wanted: `bq show --format=prettyjson bigquery-public-data:samples.wikipedia | jq '.schema.fields'` – Daniel Waechter Apr 04 '17 at 16:13
  • Is there any way to forward the output into a text file? My schema definitions are too large for the terminal. – flowoo Jul 12 '17 at 09:39
  • Just add "> yourfile.json" at the end, without quotation marks. – fpopic Dec 19 '17 at 15:43
  • Here is a link to download jq, if it's not already on your system: https://stedolan.github.io/jq/download/ – Jas Apr 24 '18 at 22:09
  • With Windows I found that the quotation mark `"` is needed instead of the apostrophe `'`, so as follows: `bq show --format=prettyjson bigquery-public-data:samples.wikipedia | jq ".schema.fields"` – philshem Sep 09 '19 at 09:04
  • Why am I getting this error when using --format as csv or pretty: "parse error: Invalid numeric literal at line 1, column 5"? – vikrant rana Dec 18 '20 at 18:02
  • @vikrantrana The `jq` command parses json. If you want another format, you have to omit that part. – Rémi Svahn Mar 03 '21 at 12:44
114

You can add the `--schema` flag [1] so that only the schema is printed, without the rest of the table information.

bq show --schema --format=prettyjson [PROJECT_ID]:[DATASET].[TABLE] > [SCHEMA_FILE]

bq show --schema --format=prettyjson myprojectid:mydataset.mytable > /tmp/myschema.json

[1] https://cloud.google.com/bigquery/docs/managing-table-schemas
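The dumped file can also be fed back to bq to create a new table with the same schema (a sketch; mydataset.mynewtable is a hypothetical target table):

bq show --schema --format=prettyjson myprojectid:mydataset.mytable > /tmp/myschema.json
bq mk --table mydataset.mynewtable /tmp/myschema.json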

Idan Gozlan
bsmarcosj
  • Excellent! It looks like that was added a few months after I asked this question, in Cloud SDK version 165. Much better than relying on `jq`. – Daniel Waechter Jun 22 '18 at 18:18
22
  1. Select the table in the BigQuery web UI.
  2. Select the columns you want to export the schema for.
  3. Use the copy menu to copy the schema as JSON.

[screenshot: table schema]

Anthony Awuley
11

Answer update

Since October 2020, you can also run a SQL query against INFORMATION_SCHEMA.COLUMNS, which is a kind of introspection facility:

SELECT *
FROM <YOUR_DATASET>.INFORMATION_SCHEMA.COLUMNS

and nest the data using an aggregation function such as

SELECT table_name, ARRAY_AGG(STRUCT(column_name, data_type)) as columns
FROM <YOUR_DATASET>.INFORMATION_SCHEMA.COLUMNS
GROUP BY table_name

There is also interesting metadata in INFORMATION_SCHEMA.VIEWS if you also need the source SQL of your views.

Then hit Save results -> JSON in the BigQuery interface, or wrap the query in a `bq query` command line in your case (see the sketch below).
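A minimal sketch of that bq query wrapping (mydataset is a hypothetical dataset name; the result is written as JSON to a file):

bq query --use_legacy_sql=false --format=prettyjson \
  'SELECT table_name, ARRAY_AGG(STRUCT(column_name, data_type)) AS columns
   FROM mydataset.INFORMATION_SCHEMA.COLUMNS
   GROUP BY table_name' > /tmp/dataset_schema.json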

Source: BigQuery release notes

Michel Hua
4

You can use a REST API call to get a BigQuery table schema as JSON. Documentation link: https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/get

curl 'https://bigquery.googleapis.com/bigquery/v2/projects/project-name/datasets/dataset-name/tables/table-name' \
     --header 'Authorization: Bearer [YOUR_ACCESS_TOKEN]' \
     --header 'Accept: application/json' \
     --compressed
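
A sketch of the same call that fetches a token with gcloud and extracts just the schema with jq (project-name, dataset-name and table-name are placeholders, as above):

curl -s 'https://bigquery.googleapis.com/bigquery/v2/projects/project-name/datasets/dataset-name/tables/table-name' \
     --header "Authorization: Bearer $(gcloud auth print-access-token)" \
     --header 'Accept: application/json' | jq '.schema.fields'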
Soumendra Mishra
  • Thank you for this - I was looking for the API version. Is there a way to do this without an HTTP call? Is a function like this what the more 'native'-looking functions of the API look like under the hood? (I mean that normal functions are not HTTP calls.) I just don't want it to be slow, and "calls" seem slow. – makmak Aug 22 '20 at 17:15
  • When I usually load data, etc., I don't need to think about authentication, for instance. @Soumendra Mishra – makmak Aug 22 '20 at 18:27
2

As of 15th May 2022, this worked:

  1. In Google Cloud, go to Cloud Shell.
  2. Select the project from the drop-down (left) of Cloud Shell.
  3. Use the command `bq show --schema --format=prettyjson [DATASET].[TABLE]` (see the example below).
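
For example (a sketch; mydataset.mytable is a placeholder, and the result is redirected to a file as in the answers above):

bq show --schema --format=prettyjson mydataset.mytable > myschema.json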
0

The following bash script and SQL always helped me extract the schema of every table in a dataset to JSON files:

#!/bin/bash
# gen-default-schema.sh
# Usage: gen-default-schema.sh <sql-file> <source-type>
input=$1
source_type=$2
result=tables_${source_type}.result

# Make sure the output directory exists
mkdir -p ./temp/${source_type}

# Run the SQL file to list all tables, dropping the CSV header row
bq query --format=csv --use_legacy_sql=false --flagfile=$input | awk '{if(NR>1)print}' > $result

# Dump each table's schema to its own JSON file
while IFS= read -r line
do
    tbl_name=`echo "$line" | awk -F. '{print $NF}'`
    schema_file=`echo "$tbl_name" | cut -d'_' -f2-`.schema
    echo $schema_file
    bq show --schema --format=prettyjson $line > ./temp/${source_type}/${schema_file}
    echo "done"
done < "$result"

Input file example.sql ($1)

SELECT
  table_catalog || ":" || table_schema || "." || table_name
FROM (
  SELECT
    table_catalog,
    table_schema,
    table_name
  FROM
    `project-id`.<dataset_id>.INFORMATION_SCHEMA.TABLES
  ORDER BY
    table_name ASC )

To run:

$ bash gen-default-schema.sh example.sql example

This will place all the JSON schema files under the ./temp folder.

Logan
0

If you want to do this from the Google Cloud console, a short SQL query can achieve it.

It'll give you all the info from the schema, and you can adjust the STRUCT( ... ) with any of the fields described at https://cloud.google.com/bigquery/docs/information-schema-column-field-paths#schema as you wish.

Alternatively, use INFORMATION_SCHEMA.<something> with other views to get different meta info as JSON.

As @Michel Hua said in their answer, select Query results -> JSON in BigQuery to get JSON after running the SQL query:

SELECT table_name, ARRAY_AGG(STRUCT(column_name, data_type, description)) as columns
FROM `your-project-id`.your_dataset.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS 
WHERE table_name = 'your_table_name' 
GROUP BY table_name
eemilk