27

I need to get the contents of a file hosted in a GitHub repo. I'd prefer a JSON response that includes metadata along with the contents. I've tried numerous URLs with cURL, only to get a response of {"message":"Not Found"}. I just need the URL structure. If it matters, the repo belongs to an organization on GitHub. Here's what I think should work but doesn't:

http://api.github.com/repos/<organization>/<repository>/git/branches/<branch>/<file>
Tyler Crompton
  • 1
    see http://stackoverflow.com/questions/9240961/github-jsonp-source-code-api/9241535#9241535 – nulltoken Feb 14 '12 at 06:32
  • Three requests for a simple JSON response? Good lawd. Not intuitive at all. Surely there's a more elegant way. – Tyler Crompton Feb 14 '12 at 08:53
  • 1
    This is probably one of the weakest bits of their API. You can navigate the structure using their Trees API (at Git Data in docs). In order to use that you'll need a sha. You can dig that out of repo branches. Perhaps it is easier for you to use raw.github.com like this? raw.github.com/:user/:repo/:branch/:filename . You can easily combine these two approaches to figure out if some file exists and then to fetch it. – Juho Vepsäläinen May 23 '12 at 12:07
  • Yeah, I found out about that a couple of days ago. I need the file structure, though. Basically, I want to link to the Github files on my website. Think of it as an index page for my Github files. – Tyler Crompton May 23 '12 at 13:43

2 Answers

37

As the documentation (at http://developer.github.com/v3/repos/contents/) says, the endpoint is:

/repos/:owner/:repo/contents/:path

An Ajax call with jQuery would look like this:

$.ajax({
    url: readme_uri,
    dataType: 'jsonp',
    success: function (results) {
        // With JSONP the payload is wrapped in `data`; `content` is Base64-encoded.
        var content = results.data.content;
    }
});

Replace readme_uri with the full URL for the file, i.e. https://api.github.com followed by /repos/:owner/:repo/contents/:path.

atejeda
  • Is this new? I swear this wasn't here when I asked. I looked all over the dev pages for this. Thanks. – Tyler Crompton Feb 08 '13 at 14:42
  • 2
    Looks like GitHub is sending file content encoded in Base64... – taseenb Mar 30 '14 at 02:50
  • 32
    @taseenb use `https://raw.githubusercontent.com/:owner/:repo/master/:path` to get raw (binary, not Base64) – Peter Krauss Aug 31 '15 at 21:44
  • @Peter where did you find the link you mentioned in your comment? Saved my day :) It was horrible converting base64 encoded content back to raw – rohanagarwal Jun 19 '17 at 10:41
  • @rohanagarwal If you browse to a file on GitHub and then click the Raw link, that's where you'll go. [What do raw.githubusercontent.com URLs represent?](https://stackoverflow.com/questions/39065921/what-do-raw-githubusercontent-com-urls-represent) – Hanabi Nov 16 '21 at 06:28
  • 2
    You can request the raw content by setting the `Accept` header to `application/vnd.github.v3.raw` – wilrnh Nov 24 '21 at 02:54
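The `Accept` header approach from the comment above can be sketched in Python with only the standard library. This is a minimal sketch, not a definitive implementation: the helper names `build_raw_request` and `fetch_raw_file` are hypothetical, the `token` scheme mirrors the one used elsewhere on this page, and `ref` is the Contents API's optional query parameter for a branch, tag, or commit.

```python
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"
RAW_MEDIA_TYPE = "application/vnd.github.v3.raw"  # media type quoted in the comment above


def build_raw_request(owner, repo, path, ref=None, token=None):
    """Build (but do not send) a Contents API request that asks for the raw file."""
    # quote() leaves "/" intact, so nested paths such as "docs/a b.md" still work.
    url = f"{API_ROOT}/repos/{owner}/{repo}/contents/{urllib.parse.quote(path)}"
    if ref:
        # Optional branch, tag, or commit SHA; the default branch is used otherwise.
        url += "?" + urllib.parse.urlencode({"ref": ref})
    headers = {"Accept": RAW_MEDIA_TYPE}
    if token:
        headers["Authorization"] = f"token {token}"
    return urllib.request.Request(url, headers=headers)


def fetch_raw_file(owner, repo, path, ref=None, token=None):
    """Send the request and return the file body as bytes (no Base64 decoding step)."""
    request = build_raw_request(owner, repo, path, ref=ref, token=token)
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Calling `fetch_raw_file("airbnb", "javascript", "package.json")` should return the file's bytes directly. The trade-off is that the JSON metadata (sha, size, and so on) is not part of a raw response.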
29

This GitHub API page provides the full reference. The API endpoint for reading a file:

https://api.github.com/repos/{username}/{repository_name}/contents/{file_path}

Example response (abridged):
{
  "encoding": "base64",
  "size": 5362,
  "name": "README.md",
  "content": "encoded content ...",
  "sha": "3d21ec53a331a6f037a91c368710b99387d012c1",
  ...
}
  • Consider using a personal access token. It:
    • Raises the rate limit (up to 60 requests per hour anonymously, up to 5,000 per hour when authenticated)
    • Enables access to files in private repos
  • The file content in the response is a Base64-encoded string
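To see how much of the rate limit is left, one option is to read the `X-RateLimit-*` headers that GitHub returns with API responses. The following is a rough sketch: `rate_limit_status` is a hypothetical helper name, and it assumes the headers arrive as a plain mapping of strings (for example `dict(response.headers)`).

```python
from datetime import datetime, timezone


def rate_limit_status(headers):
    """Summarise GitHub's X-RateLimit-* response headers.

    `headers` is any mapping of header names to string values.
    Header names are matched case-insensitively.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    limit = int(lowered["x-ratelimit-limit"])
    remaining = int(lowered["x-ratelimit-remaining"])
    # The reset header is a Unix timestamp in seconds (UTC).
    reset_at = datetime.fromtimestamp(int(lowered["x-ratelimit-reset"]), tz=timezone.utc)
    return {
        "limit": limit,
        "remaining": remaining,
        "reset_at": reset_at,
        "exhausted": remaining == 0,
    }
```

If `exhausted` is true, waiting until `reset_at` or switching to an authenticated request are the usual ways out.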

Using curl, jq

Reading https://github.com/airbnb/javascript/blob/master/package.json using GitHub's API via curl and jq:

curl https://api.github.com/repos/airbnb/javascript/contents/package.json | jq -r ".content" | base64 --decode

Using Python

Reading https://github.com/airbnb/javascript/blob/master/package.json using GitHub's API in Python:

import base64
import json
import requests
import os


def github_read_file(username, repository_name, file_path, github_token=None):
    headers = {}
    if github_token:
        headers['Authorization'] = f"token {github_token}"
        
    url = f'https://api.github.com/repos/{username}/{repository_name}/contents/{file_path}'
    r = requests.get(url, headers=headers)
    r.raise_for_status()
    data = r.json()
    file_content = data['content']
    file_content_encoding = data.get('encoding')
    if file_content_encoding == 'base64':
        file_content = base64.b64decode(file_content).decode()

    return file_content


def main():
    github_token = os.environ['GITHUB_TOKEN']
    username = 'airbnb'
    repository_name = 'javascript'
    file_path = 'package.json'
    file_content = github_read_file(username, repository_name, file_path, github_token=github_token)
    data = json.loads(file_content)
    print(data['name'])


if __name__ == '__main__':
    main()
  • Define an environment variable GITHUB_TOKEN before running
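For the index-page use case raised in the question's comments, the same contents endpoint returns a JSON array when the path points at a directory, with fields such as `name`, `type`, and `html_url` on each entry. The sketch below is one possible way to turn that listing into rows for a page of links; `index_entries` and `list_directory` are hypothetical names, and sorting directories first is purely an illustrative choice.

```python
import json
import urllib.parse
import urllib.request


def index_entries(listing):
    """Turn a directory listing into (name, type, html_url) rows, directories first."""
    rows = [(item["name"], item["type"], item["html_url"]) for item in listing]
    return sorted(rows, key=lambda row: (row[1] != "dir", row[0].lower()))


def list_directory(owner, repo, path="", token=None):
    """Fetch a directory listing from the Contents API and return index rows."""
    url = f"https://api.github.com/repos/{owner}/{repo}/contents"
    if path:
        url += "/" + urllib.parse.quote(path)
    headers = {}
    if token:
        headers["Authorization"] = f"token {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        listing = json.load(response)
    if not isinstance(listing, list):
        # A single JSON object means the path was a file, not a directory.
        raise ValueError(f"{path!r} is not a directory")
    return index_entries(listing)
```

Each row's `html_url` is the normal github.com page for that file or folder, which is what a page of links would point to.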
Jossef Harush Kadouri
  • 3
    Note: to have the contents directly instead of a base64 version, pass `'Accept: application/vnd.github.v3.raw'` request header: `curl -H 'Accept: application/vnd.github.v3.raw' 'https://api.github.com/repos/airbnb/javascript/contents/package.json'` (no need to pipe to `| jq -r ".content" | base64 --decode`). – jakub.g May 31 '23 at 21:56