My goal is to track the total number of stars of my repo. However, its repo.name has changed over time. How can I achieve this with the githubarchive dataset?
Steren
1 Answer
(related to https://stackoverflow.com/a/42930963/132438)
GitHub project names change over time, so it's safer to query by id than by name. You could look up the project id in a separate query, or do it all in a single query like this:
SELECT
  COUNT(*) naive_count,
  COUNT(DISTINCT actor.id) unique_by_actor_id,
  COUNT(DISTINCT actor.login) unique_by_actor_login
FROM `githubarchive.month.*`
-- Resolve the numeric id from a month in which the repo had this name
WHERE repo.id = (
    SELECT repo.id
    FROM `githubarchive.month.201702`
    WHERE repo.name = 'bazelbuild/bazel'
    LIMIT 1)
  -- Stars are recorded as WatchEvent in the archive
  AND type = "WatchEvent"
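
The separate-query alternative mentioned above could be sketched roughly as follows. It reuses the table names and the example repository from the query above. The literal `123456789` is a hypothetical placeholder, not the real id of any project; substitute whatever step 1 returns.

```sql
-- Step 1: look up the numeric id behind a repository name that you know
-- was in use during the chosen month (201702 here, as in the query above).
SELECT repo.id AS repo_id
FROM `githubarchive.month.201702`
WHERE repo.name = 'bazelbuild/bazel'
LIMIT 1;

-- Step 2: count star events by id across all monthly tables.
-- 123456789 is a placeholder: replace it with the repo_id from step 1.
SELECT
  COUNT(*) AS naive_count,
  COUNT(DISTINCT actor.id) AS unique_by_actor_id
FROM `githubarchive.month.*`
WHERE repo.id = 123456789
  AND type = 'WatchEvent';
```

Step 1 only finds a row if the repository had at least one event under that name in the month you pick, so choose a month when the project was active under the name you know.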

Felipe Hoffa