44

I want to create a repository and commit a few files to it using any Python package. How do I do that?

I do not understand how to add files for commit.

Jai Pandya
Denis SkS

7 Answers

38

Solution using the requests library:

Note: I use the requests library to call the GitHub REST API v3. Each step below shows the request as a comment, followed by the value to read from the JSON response.

1. Get the last commit SHA of a specific branch

# GET /repos/:owner/:repo/branches/:branch_name
last_commit_sha = response.json()['commit']['sha']

2. Create a blob for each file's content (encoded as base64 or utf-8)

# POST /repos/:owner/:repo/git/blobs
# {
#  "content": "aGVsbG8gd29ybGQK",
#  "encoding": "base64"
#}
base64_blob_sha = response.json()['sha']

# POST /repos/:owner/:repo/git/blobs
# {
#  "content": "hello world",
#  "encoding": "utf-8"
#}
utf8_blob_sha = response.json()['sha']

3. Create a tree which defines the folder structure

# POST /repos/:owner/:repo/git/trees
# {
#   "base_tree": last_commit_sha,
#   "tree": [
#     {
#       "path": "myfolder/base64file.txt",
#       "mode": "100644",
#       "type": "blob",
#       "sha": base64_blob_sha
#     },
#     {
#       "path": "file-utf8.txt",
#       "mode": "100644",
#       "type": "blob",
#       "sha": utf8_blob_sha
#     }
#   ]
# }
tree_sha = response.json()['sha']

4. Create the commit

# POST /repos/:owner/:repo/git/commits
# {
#   "message": "Add new files at once programmatically",
#   "author": {
#     "name": "Jan-Michael Vincent",
#     "email": "JanQuadrantVincent16@rickandmorty.com"
#   },
#   "parents": [
#     last_commit_sha
#   ],
#   "tree": tree_sha
# }
new_commit_sha = response.json()['sha']

5. Update the reference of your branch to point to the new commit SHA (for example, on the master branch, :branch is master)

# PATCH /repos/:owner/:repo/git/refs/heads/:branch
# {
#     "sha": new_commit_sha
# }
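
The five steps above can be strung together roughly as follows. This is a minimal sketch, not a definitive implementation: `commit_files` is a name made up here, and `session` is assumed to behave like an authenticated `requests.Session` (anything with a compatible `request` method works).

```python
import base64

API = "https://api.github.com"


def commit_files(session, repo, branch, files, message):
    """Commit `files` ({path: bytes}) to `branch` of `repo` ("owner/name").

    `session` is assumed to behave like an authenticated requests.Session.
    Returns the SHA of the new commit.
    """
    def call(method, path, payload=None):
        resp = session.request(method, f"{API}/repos/{repo}{path}", json=payload)
        resp.raise_for_status()
        return resp.json()

    # 1. Get the last commit SHA of the branch.
    last_commit_sha = call("GET", f"/branches/{branch}")["commit"]["sha"]

    # 2. Create one blob per file (base64 is safe for any bytes).
    entries = []
    for path, content in files.items():
        blob = call("POST", "/git/blobs", {
            "content": base64.b64encode(content).decode("ascii"),
            "encoding": "base64",
        })
        entries.append(
            {"path": path, "mode": "100644", "type": "blob", "sha": blob["sha"]}
        )

    # 3. Create a tree on top of the current one; base_tree keeps existing files.
    tree = call("POST", "/git/trees", {"base_tree": last_commit_sha, "tree": entries})

    # 4. Create the commit, with the previous commit as its parent.
    commit = call("POST", "/git/commits", {
        "message": message,
        "parents": [last_commit_sha],
        "tree": tree["sha"],
    })

    # 5. Point the branch at the new commit.
    call("PATCH", f"/git/refs/heads/{branch}", {"sha": commit["sha"]})
    return commit["sha"]
```

To try it, pass a `requests.Session()` whose `Authorization` header carries your token, for example `commit_files(session, "you/test", "master", {"foo.txt": b"hello"}, "Add foo")`.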

Finally, for a more advanced setup read the docs.

gaddman
Arthur Miranda
  • 1
    Thanks for this. It's difficult to follow just a description on how to do this and the code works and made it clear. – danl Nov 02 '20 at 13:24
  • 2
    These are just HTTP request/response packages. It doesn't really explain how to do it or what they mean. – ingyhere Nov 24 '20 at 17:12
  • I have a problem: when I create a new commit that adds 2 files to the repo, it deletes everything else in the repo. Only the 2 files in the new commit get uploaded. How can I prevent my new commit from deleting the files and folders already in the repo? – SANDEEP S S Apr 14 '22 at 07:18
  • Hey, the link for the "requests library" is broken. – nyxz Jun 21 '22 at 09:32
  • 1
    @SANDEEPSS - I had the same issue, it's because you've forgotten to add the `base_tree` in step 3. Found in the docs: "If not provided, GitHub will create a new Git tree object from only the entries defined in the tree parameter. If you create a new commit pointing to such a tree, then all files which were a part of the parent commit's tree and were not defined in the tree parameter will be listed as deleted by the new commit." – Toby Smith Jul 03 '22 at 19:50
  • 1
    You can skip step #2 if you're willing to use utf-8 rather than base64. In step #3, don't provide a `sha` and instead provide a `content`. – Toby Smith Jul 03 '22 at 20:16
  • This helped me a lot. The GitHub API documentation is quite horrible, a long list of operations without any decent TOC, hard to find the operation you're looking for, and I'm missing descriptions of sequences of operations and how they fit together. Translating `git add .; git commit; git push;` into this 5-step process wasn't obvious. Cheers! – JHH Mar 03 '23 at 07:33
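
The two points raised in the comments (always send `base_tree`, and optionally send `content` instead of `sha`) could look like this when building the body for step 3. A small sketch only; `inline_tree_payload` is a made-up helper name, and inline content is only suitable for text that is valid UTF-8.

```python
def inline_tree_payload(base_tree_sha, files):
    """Build the body for POST /repos/:owner/:repo/git/trees from {path: text}.

    Each entry carries `content` instead of `sha`, so no blob has to be
    created beforehand (step 2 is skipped).
    """
    return {
        # Without base_tree, files not listed below show up as deleted.
        "base_tree": base_tree_sha,
        "tree": [
            {"path": path, "mode": "100644", "type": "blob", "content": text}
            for path, text in files.items()
        ],
    }


# "abc123" stands in for the SHA obtained in step 1.
payload = inline_tree_payload("abc123", {"file-utf8.txt": "hello world"})
```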
20

You can check whether the new GitHub CRUD API (May 2013) helps:

The repository contents API has allowed reading files for a while. Now you can easily commit changes to single files, just like you can in the web UI.

Starting today, these methods are available to you:

VonC
  • 1
    This is the easiest solution. You can find libraries here: http://developer.github.com/libraries/#python – timaschew Jan 24 '14 at 10:56
19

Here is a complete snippet:

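import base64
import json

import requests


def push_to_github(filename, repo, branch, token):
    url = "https://api.github.com/repos/" + repo + "/contents/" + filename
    headers = {"Authorization": "token " + token}

    # Read the local file; the API expects base64-encoded content.
    with open(filename, "rb") as f:
        content = f.read()
    base64content = base64.b64encode(content)

    # Fetch the current version to get its blob SHA (required for an update).
    # Note: this assumes the file already exists on the branch.
    data = requests.get(url + "?ref=" + branch, headers=headers).json()
    sha = data["sha"]

    # GitHub returns the base64 text wrapped with newlines, so compare decoded bytes.
    if base64.b64decode(data["content"]) != content:
        message = json.dumps({"message": "update",
                              "branch": branch,
                              "content": base64content.decode("utf-8"),
                              "sha": sha
                              })

        resp = requests.put(url, data=message,
                            headers={"Content-Type": "application/json", **headers})

        print(resp)
    else:
        print("nothing to update")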

token = "lskdlfszezeirzoherkzjehrkzjrzerzer"
filename="foo.txt"
repo = "you/test"
branch="master"

push_to_github(filename, repo, branch, token)
Martin Monperrus
  • This code will fail when the file does not exist in the repository. It works only if you are updating an existing file; for a new file it returns a Not Found error. – Nishant Nawarkhede Jun 08 '21 at 19:05
  • What is the correct way to use this API endpoint to create a new file? https://docs.github.com/en/rest/reference/repos#create-or-update-file-contents The endpoint is for creation too, but using it like this answer only allows updating an existing file. – mbtamuli Jul 05 '21 at 12:49
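
On the question of creating a new file: as far as I can tell, the same PUT endpoint creates the file when the request body has no `sha`, and updates it when `sha` holds the existing blob SHA. A sketch of that decision, with made-up helper names (`contents_payload`, `sha_from_lookup`):

```python
import base64


def contents_payload(content, message, branch, existing_sha=None):
    """Body for PUT /repos/:owner/:repo/contents/:path.

    Pass `existing_sha` (the blob SHA from a prior GET) to update a file;
    leave it as None to create a file that does not exist yet.
    """
    body = {
        "message": message,
        "branch": branch,
        "content": base64.b64encode(content).decode("ascii"),
    }
    if existing_sha is not None:
        body["sha"] = existing_sha
    return body


def sha_from_lookup(status_code, data):
    """Interpret the GET on the contents URL: 404 means the file is new."""
    if status_code == 404:
        return None
    if status_code == 200:
        return data["sha"]
    raise RuntimeError(f"unexpected status {status_code}")
```

In `push_to_github` above, this would mean checking the status code of the GET before reading `data['sha']`, and leaving `sha` out of the message on a 404.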
12

GitHub provides a Git database API that gives you access to read and write raw objects and to list and update your references (branch heads and tags). For a better understanding of the topic, I would highly recommend reading the Git Internals chapter of the Pro Git book.
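
To make "raw objects" a little more concrete: a blob's SHA is just a hash of a short header plus the file's bytes. The sketch below reproduces that locally with `hashlib`; it assumes a repository using SHA-1 object names, and `git_blob_sha` is a name invented for this example.

```python
import hashlib


def git_blob_sha(content: bytes) -> str:
    """Return the SHA-1 that Git assigns to a blob holding `content`."""
    # Git hashes "blob <size in bytes>\0" followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()
```

This can be handy for checking whether a local file already matches the blob SHA listed in a tree, without downloading the blob.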

As per the documentation, committing a change to a file in your repository is a seven-step process:

  1. get the current commit object
  2. retrieve the tree it points to
  3. retrieve the content of the blob object that tree has for that particular file path
  4. change the content somehow and post a new blob object with that new content, getting a blob SHA back
  5. post a new tree object with that file path pointer replaced with your new blob SHA getting a tree SHA back
  6. create a new commit object with the current commit SHA as the parent and the new tree SHA, getting a commit SHA back
  7. update the reference of your branch to point to the new commit SHA
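
Steps 1 to 3 (the read side) can be sketched as below. `get_json` is a hypothetical helper that takes an API path and returns the decoded JSON body, so you can plug in whatever HTTP client you use; the endpoint paths are those of the Git database API.

```python
import base64


def read_file_at_commit(get_json, repo, commit_sha, path):
    """Follow commit -> tree -> blob and return the file's bytes.

    `get_json` is a hypothetical helper: given an API path such as
    "/repos/owner/name/git/commits/<sha>", it returns the parsed JSON.
    """
    # Step 1: the commit object names its root tree.
    commit = get_json(f"/repos/{repo}/git/commits/{commit_sha}")
    # Step 2: fetch that tree; recursive=1 also lists files in subfolders.
    tree = get_json(
        f"/repos/{repo}/git/trees/{commit['tree']['sha']}?recursive=1"
    )
    # Step 3: find the entry for `path` and fetch the blob it points to.
    for entry in tree["tree"]:
        if entry["path"] == path and entry["type"] == "blob":
            blob = get_json(f"/repos/{repo}/git/blobs/{entry['sha']}")
            # Blob content comes back base64-encoded, possibly with newlines.
            return base64.b64decode(blob["content"])
    raise FileNotFoundError(path)
```

Steps 4 to 7 then post the changed content back as a new blob, tree and commit, and move the branch reference.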

This blog does a great job of explaining this process using Perl. For a Python implementation, you can use the PyGithub library.

Jai Pandya
3

Based on the previous answer, here is a complete example. Note that the final step must use POST (to create the reference) if you are uploading the commit to a new branch, or PATCH (to update the reference) for an existing one.


    import base64
    import http.client
    import json
    import urllib.parse

    GITHUB_API_URL = "https://api.github.com"
    GITHUB_TOKEN = "WHATEVERWILLBEWILLBE"

    def github_request(method, url, headers=None, data=None, params=None):
        """Execute a request to the GitHub API, handling redirects."""
        # Copy so the caller's dict is not mutated.
        headers = dict(headers or {})
        headers.update({
            "User-Agent": "Agent 007",
            "Authorization": "Bearer " + GITHUB_TOKEN,
        })

        # Accept both absolute URLs and API paths such as "/repos/...".
        url_parsed = urllib.parse.urlparse(urllib.parse.urljoin(GITHUB_API_URL, url))
        url_path = url_parsed.path
        if params:
            url_path += "?" + urllib.parse.urlencode(params)

        data = data and json.dumps(data)
        conn = http.client.HTTPSConnection(url_parsed.hostname)
        conn.request(method, url_path, body=data, headers=headers)
        response = conn.getresponse()
        if response.status == 302:
            return github_request(method, response.headers["Location"])

        if response.status >= 400:
            # Never include the token in the error message.
            headers.pop('Authorization', None)
            raise Exception(
                f"Error: {response.status} - {json.loads(response.read())} - {method} - {url} - {data} - {headers}"
            )

        return (response, json.loads(response.read().decode()))
      
    def upload_to_github(repository, src, dst, author_name, author_email, git_message, branch="heads/master"):
        # Get last commit SHA of a branch
        resp, jeez = github_request("GET", f"/repos/{repository}/git/ref/{branch}")
        last_commit_sha = jeez["object"]["sha"]
        print("Last commit SHA: " + last_commit_sha)
    
        base64content = base64.b64encode(open(src, "rb").read())
        resp, jeez = github_request(
            "POST",
            f"/repos/{repository}/git/blobs",
            data={
                "content": base64content.decode(),
                "encoding": "base64"
            },
        )
        blob_content_sha = jeez["sha"]
    
        resp, jeez = github_request(
            "POST",
            f"/repos/{repository}/git/trees",
            data={
                "base_tree":
                last_commit_sha,
                "tree": [{
                    "path": dst,
                    "mode": "100644",
                    "type": "blob",
                    "sha": blob_content_sha,
                }],
            },
        )
        tree_sha = jeez["sha"]
    
        resp, jeez = github_request(
            "POST",
            f"/repos/{repository}/git/commits",
            data={
                "message": git_message,
                "author": {
                    "name": author_name,
                    "email": author_email,
                },
                "parents": [last_commit_sha],
                "tree": tree_sha,
            },
        )
        new_commit_sha = jeez["sha"]
    
        resp, jeez = github_request(
            "PATCH",
            f"/repos/{repository}/git/refs/{branch}",
            data={"sha": new_commit_sha},
        )
        return (resp, jeez)
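
The POST versus PATCH distinction for the last step could be expressed as below. This is a sketch under the assumption that you already know whether the branch exists; `ref_request` is a made-up name, and `branch` uses the same "heads/<name>" form as `upload_to_github`.

```python
def ref_request(repository, branch, commit_sha, branch_exists):
    """Return (method, path, body) for pointing `branch` at `commit_sha`."""
    if branch_exists:
        # Existing branch: move the reference to the new commit.
        return (
            "PATCH",
            f"/repos/{repository}/git/refs/{branch}",
            {"sha": commit_sha},
        )
    # New branch: create the reference; the name must be fully qualified.
    return (
        "POST",
        f"/repos/{repository}/git/refs",
        {"ref": f"refs/{branch}", "sha": commit_sha},
    )
```

The returned tuple can be passed straight to `github_request(method, path, data=body)`.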

Chmouel Boudjnah
1

I have created an example for committing with multiple files using Python:

import datetime
import os
import github
   
# If you run this example with a personal access token, the commit will not be verified.
# Verification only applies to commits made with a token generated for a bot/app
# during the workflow job execution.

def main(repo_token, branch):

    gh = github.Github(repo_token)

    repository = "josecelano/pygithub"

    remote_repo = gh.get_repo(repository)

    # Update files:
    #   data/example-04/latest_datetime_01.txt
    #   data/example-04/latest_datetime_02.txt
    # with the current date.

    file_to_update_01 = "data/example-04/latest_datetime_01.txt"
    file_to_update_02 = "data/example-04/latest_datetime_02.txt"

    now = datetime.datetime.now()

    file_to_update_01_content = str(now)
    file_to_update_02_content = str(now)

    blob1 = remote_repo.create_git_blob(file_to_update_01_content, "utf-8")
    element1 = github.InputGitTreeElement(
        path=file_to_update_01, mode='100644', type='blob', sha=blob1.sha)

    blob2 = remote_repo.create_git_blob(file_to_update_02_content, "utf-8")
    element2 = github.InputGitTreeElement(
        path=file_to_update_02, mode='100644', type='blob', sha=blob2.sha)

    commit_message = f'Example 04: update datetime to {now}'

    branch_sha = remote_repo.get_branch(branch).commit.sha
   
    base_tree = remote_repo.get_git_tree(sha=branch_sha)
 
    tree = remote_repo.create_git_tree([element1, element2], base_tree)

    parent = remote_repo.get_git_commit(sha=branch_sha)

    commit = remote_repo.create_git_commit(commit_message, tree, [parent])

    branch_refs = remote_repo.get_git_ref(f'heads/{branch}')

    branch_refs.edit(sha=commit.sha)
Jose Celano
0

I'm on Google App Engine (GAE), so besides Python, I can create a new file, update it, or even delete it via a commit and push to my GitHub repo using the GitHub API v3 in PHP, Java, and Go.

After reviewing some of the available third-party libraries for writing something like the example script presented in Perl, I would recommend the following:

As you may be aware, you get one site per GitHub account and organization, plus unlimited project sites, where the websites are hosted directly from your repo and powered by Jekyll by default.

Combining Jekyll, webhooks, and a GitHub API script on GAE, along with appropriate GAE settings, opens up many possibilities, such as calling an external script and creating a dynamic page on GitHub.

Besides GAE, there is also the option of running it on Heroku. Use JekyllBot, which lives on a (free) Heroku instance, to silently generate JSON files for each post and push the changes back to GitHub.

Community
eQ19