GitHub GraphQL query: Reaching null cursor after 1000 results (10 requests * 100 items per request)

Question

I'm the lead student-researcher on a team trying to analyze and mine GitHub repositories. We're trying to get the repo_owner and repo_name for every project hosted on GitHub that matches the following search query:

query MyQuery {
  search(query: "language:Python", type: REPOSITORY, 
    first: 100
  ) {
    pageInfo {
      endCursor
      hasNextPage
    }
    edges {
      node {
        ... on Repository {
          nameWithOwner
          issues {
            totalCount
          }
          defaultBranchRef {
            target {
              ... on Commit {
                history(first: 0) {
                  totalCount
                }
              }
            }
          }
        }
      }
    }
  }
}

We are able to iterate through the cursors 10 times, but when we reach the cursor "Y3Vyc29yOjEwMDA=" and send the following query:

query MyQuery {
  search(query: "language:Python", type: REPOSITORY, 
    first: 100, after:"Y3Vyc29yOjEwMDA="
  ) {
    pageInfo {
      endCursor
      hasNextPage
    }
    edges {
      node {
        ... on Repository {
          nameWithOwner
          issues {
            totalCount
          }
          defaultBranchRef {
            target {
              ... on Commit {
                history(first: 0) {
                  totalCount
                }
              }
            }
          }
        }
      }
    }
  }
}

We get the following response:

{
  "data": {
    "search": {
      "pageInfo": {
        "endCursor": null,
        "hasNextPage": false
      },
      "edges": []
    }
  }
}
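For reference, the cursor-following loop described above can be sketched roughly as follows. This is a minimal sketch, not our exact script: `fetch_page` is a hypothetical callable that sends the query with the given `after` cursor (or none for the first page) and returns the `data.search` object of the response. The small `decode_cursor` helper is also only illustrative; it shows that the cursor above is plain Base64 text.

```python
import base64
from typing import Callable, Iterator, Optional


def decode_cursor(cursor: str) -> str:
    """Base64-decode a cursor string so it can be inspected."""
    return base64.b64decode(cursor).decode("utf-8")


def paginate(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield repository nodes page by page until hasNextPage is false.

    fetch_page is a hypothetical helper (an assumption, not a GitHub API):
    given an `after` cursor, or None for the first page, it returns the
    `data.search` object of the GraphQL response.
    """
    cursor = None
    while True:
        search = fetch_page(cursor)
        for edge in search["edges"]:
            yield edge["node"]
        page_info = search["pageInfo"]
        if not page_info["hasNextPage"]:
            break
        cursor = page_info["endCursor"]


print(decode_cursor("Y3Vyc29yOjEwMDA="))  # prints "cursor:1000"
```

The decoded value suggests the cursor is simply an offset into the result list, which is consistent with the results stopping at exactly 1000 items.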

I know from a quick advanced search on GitHub that there are currently ~4,000,000 public Python repositories hosted on the site, but we can only get 1000 of them before we encounter this null cursor.

Please let us know if there is a workaround for this problem. We'd like to continue to use the v4 API because of the minimalistic data output (i.e., it only gives us what we want: repo_owner and repo_name along with issue count and commit count).

Thank you for your help!

Does this answer your question? [github search limit results](https://stackoverflow.com/questions/37602893/github-search-limit-results) — Edeki Okoh, Mar 17 '20 at 23:31
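The linked question concerns GitHub's cap of 1,000 results per search. One commonly suggested way around it is to split one broad search into many narrower ones, for example with the `created:` date-range qualifier, and paginate each narrow search separately. The following is only a sketch under that assumption: `created_windows` is a hypothetical helper, and the window size is illustrative. For a language as popular as Python the windows would likely need to be much narrower, so that each one matches fewer than 1,000 repositories.

```python
from datetime import date, timedelta
from typing import Iterator


def created_windows(start: date, end: date, days: int) -> Iterator[str]:
    """Yield search strings whose created: ranges cover [start, end].

    Each string restricts the search to repositories created within one
    window of at most `days` days, so each can be paginated on its own.
    """
    one_day = timedelta(days=1)
    lo = start
    while lo <= end:
        hi = min(lo + timedelta(days=days) - one_day, end)
        yield f"language:Python created:{lo.isoformat()}..{hi.isoformat()}"
        lo = hi + one_day


for query_string in created_windows(date(2020, 1, 1), date(2020, 1, 10), days=4):
    print(query_string)
# language:Python created:2020-01-01..2020-01-04
# language:Python created:2020-01-05..2020-01-08
# language:Python created:2020-01-09..2020-01-10
```

Each generated string would replace "language:Python" in the `query:` argument of the search above. Requesting the `repositoryCount` field on the search connection is one way to check whether a window is still too large.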


0 Answers