Taking inspiration from Simon Willison's 'Paginating through the GitHub GraphQL API with Python', here's what I've been doing to paginate my queries:
    query {
      node(id: "PROJECT_ID") {
        ... on ProjectNext {
          items(first: 100 after: CURSOR) {
            pageInfo {
              hasNextPage
              endCursor
            }
            nodes {
              title
              fieldValues(first: 8) {
                nodes {
                  value
                }
              }
              content {
                ... on Issue {
                  number
                  labels(first: 50) {
                    nodes {
                      name
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
In my Python code I'm replacing PROJECT_ID with a variable set to the ID of the project I'm referencing. For the cursor, after: CURSOR is replaced with "" initially, and then for the next page I set:

    cursor = 'after:\\"' + response["data"]["node"]["items"]["pageInfo"]["endCursor"] + '\\"'
My full code is in the atdumpmemex module of my dump_cards utility.
The key here is to get pageInfo along with other relevant nodes, and then grab the endCursor each time hasNextPage is true so that it can be fed into the query for the next iteration.
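To make that loop concrete, here is a minimal sketch rather than the code from my utility. The names QUERY_TEMPLATE, build_query, run_query and fetch_all_items are made up for illustration, and the query is trimmed to the fields the loop needs. It also assumes the request body is built with json.dumps, so the cursor can be spliced in with plain quotes instead of the escaped ones shown above.

```python
import json
import urllib.request

GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"

# Trimmed-down version of the query, with the same two placeholders.
QUERY_TEMPLATE = """
query {
  node(id: "PROJECT_ID") {
    ... on ProjectNext {
      items(first: 100 after: CURSOR) {
        pageInfo { hasNextPage endCursor }
        nodes { title }
      }
    }
  }
}
"""


def build_query(project_id, cursor_fragment):
    """Splice the project ID and the (possibly empty) cursor into the query."""
    return (QUERY_TEMPLATE
            .replace("PROJECT_ID", project_id)
            .replace("after: CURSOR", cursor_fragment))


def run_query(query, token):
    """POST one query to GitHub; token is a personal access token (assumed)."""
    body = json.dumps({"query": query}).encode("utf-8")
    request = urllib.request.Request(
        GITHUB_GRAPHQL_URL,
        data=body,
        headers={"Authorization": "bearer " + token,
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all_items(project_id, fetch_page):
    """Collect items from every page; fetch_page takes a query string."""
    items = []
    cursor_fragment = ""  # first page: no 'after' argument at all
    while True:
        response = fetch_page(build_query(project_id, cursor_fragment))
        page = response["data"]["node"]["items"]
        items.extend(page["nodes"])
        if not page["pageInfo"]["hasNextPage"]:
            return items
        # Feed endCursor into the query for the next iteration.
        cursor_fragment = 'after: "' + page["pageInfo"]["endCursor"] + '"'
```

Against the real API this could be called as fetch_all_items(project_id, lambda q: run_query(q, token)). Passing the fetch function in keeps the loop easy to exercise with canned responses.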
pageInfo will look something like:

    "pageInfo": {
      "hasNextPage": false,
      "endCursor": "Y3Vyc29yOnYyOpHOAAhOsg=="
    }
At the moment the endCursor is base64 encoded cursor:v2:XYZ, but don't rely on that as GitHub have moved other IDs from being base64 encoded to other schemes.
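If you're curious, the example cursor above can be decoded with Python's standard base64 module. This is only a quick inspection sketch; in real code the cursor is best treated as an opaque string and passed back unchanged.

```python
import base64

end_cursor = "Y3Vyc29yOnYyOpHOAAhOsg=="  # example value from above

raw = base64.b64decode(end_cursor)

# The first ten bytes are readable ASCII; the remainder is binary.
print(raw[:10])  # b'cursor:v2:'
```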