I have asked a previous question about doing a multi-level aggregation query on the X-axis here: Get the top patent countries, codes in a BQ public dataset.
Here is how the query (copied from the accepted answer works) to get:
Top 2 Countries by Count, and within those countries, top 2 Codes by Count
WITH A AS (
SELECT country_code
FROM `patents-public-data.patents.publications`
GROUP BY country_code
ORDER BY COUNT(1) DESC
LIMIT 2
), B AS (
SELECT
country_code,
application_kind,
COUNT(1) application_kind_count
FROM `patents-public-data.patents.publications`
WHERE country_code IN (SELECT country_code FROM A)
GROUP BY country_code, application_kind
), C AS (
SELECT
country_code,
application_kind,
application_kind_count,
DENSE_RANK() OVER(PARTITION BY country_code ORDER BY application_kind_count DESC) AS application_kind_rank
FROM B
)
SELECT
country_code,
application_kind,
application_kind_count
FROM C
WHERE application_kind_rank <= 2
And I get something like:
country_code application_kind count
JP A 125
JP U 124
CN A 118
CN U 101
Now I would like to add the following pivot on the y-axis: to get the following:
- X: Top 2 Countries by Count, and within those countries, top 2 Codes by Count
- Y: Top 2 family_id by Count, Top 2 priority_date by Count
The final results would then look like:
I am able to build the Y-query in a second query --
WITH A AS (
SELECT family_id
FROM `patents-public-data.patents.publications`
GROUP BY family_id
ORDER BY COUNT(1) DESC
LIMIT 2
), B AS (
SELECT
family_id,
priority_date,
COUNT(1) priority_date_count
FROM `patents-public-data.patents.publications`
WHERE family_id IN (SELECT family_id FROM A)
GROUP BY family_id, priority_date
), C AS (
SELECT
family_id,
priority_date,
priority_date_count,
DENSE_RANK() OVER(PARTITION BY family_id ORDER BY priority_date_count DESC) AS priority_date_rank
FROM B
)
SELECT
family_id,
priority_date,
priority_date_count
FROM C
WHERE priority_date_rank <= 2
However, I am not quite sure how to merge them together, in a single query or in two.