My guess is that patent can be cited after it's status is Application - so instead of using initial number CN-201510747352
- you should use app/pub number when status is Application - Also you need to apply not only distinct counting but also excluding counting same app with -A or -B or etc. suffix - that is why you will see use of the regex_extract function
#standardSQL
SELECT
c.publication_number AS Pub,
COUNT(DISTINCT REGEXP_EXTRACT(p.publication_number, r'(.+-.+)-')) AS CitedByCount
FROM `patents-public-data.patents.publications` AS p,
UNNEST(citation) AS c
WHERE c.publication_number LIKE ('CN-105233911%')
GROUP BY c.publication_number
with result
Row Pub CitedBy
1 CN-105233911-A 10
... If I only have the application data, how can I realize it?
#standardSQL
SELECT
c.publication_number AS Pub,
COUNT(DISTINCT REGEXP_EXTRACT(p.publication_number, r'(.+-.+)-')) AS CitedByCount
FROM `patents-public-data.patents.publications` AS p,
UNNEST(citation) AS c
WHERE c.publication_number IN (
SELECT publication_number
FROM `patents-public-data.patents.publications`
WHERE application_number IN ('CN-201510747352-A')
)
GROUP BY c.publication_number