zifan is right.
You can create one query per day for the last 30 days, or two queries per day (one every 12 hours), and so forth. The shorter the interval, the more API calls you make, but also the more repositories you capture.
Below is an example in Python. It only issues plain HTTP GET calls, so you can easily translate it to other languages.
import requests
from datetime import datetime, timedelta

URL = 'https://api.github.com/search/repositories?q=is:public created:{}..{}'
HEADERS = {'Authorization': 'token <PASTE_HERE_GITHUB_ACCESS_TOKEN>'}

since = datetime.today() - timedelta(days=30)  # Since 30 days ago
until = since + timedelta(days=1)              # Until 29 days ago

while until < datetime.today():
    day_url = URL.format(since.strftime('%Y-%m-%d'), until.strftime('%Y-%m-%d'))
    r = requests.get(day_url, headers=HEADERS)
    print(f'Repositories created between {since} and {until}: {r.json().get("total_count")}')

    # Update dates for the next search
    since = until
    until = since + timedelta(days=1)
Of course, the number of repositories per interval might still exceed what a single query can return (the GitHub Search API exposes at most 1,000 results per query). In that case, try the following (a short sketch combining some of these ideas appears after the list):
- to use pagination;
- to reduce the SINCE..UNTIL interval by using a smaller timedelta (e.g., 12 hours);
- to add further filters in the query, for example: exclude archived and forked repositories, get repositories with a minimum number of stars only, and so forth.
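For instance, here is a minimal sketch that pages through the results of a single, narrower query. The extra qualifiers (archived:false, fork:false, stars:>=10) and the dates are only illustrative; adjust them to your needs.

import requests

HEADERS = {'Authorization': 'token <PASTE_HERE_GITHUB_ACCESS_TOKEN>'}

# Illustrative query: public, non-archived, non-fork repositories with
# at least 10 stars, created on a single (example) day
QUERY = 'is:public archived:false fork:false stars:>=10 created:2021-01-01..2021-01-02'

page = 1
while True:
    r = requests.get('https://api.github.com/search/repositories',
                     params={'q': QUERY, 'per_page': 100, 'page': page},
                     headers=HEADERS)
    items = r.json().get('items', [])
    if not items:
        break

    for repo in items:
        print(repo['full_name'])

    # The Search API exposes at most 1,000 results per query
    # (10 pages of 100), so stop before requesting page 11
    if page == 10:
        break
    page += 1

Keep in mind that the Search API is rate-limited separately from the rest of the REST API, so you may need to pause between calls.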
Take a look here for an example.
Here is a Python tool to collect repositories from GitHub: https://github.com/radon-h2020/radon-repositories-collector