17

I created a new repository on github.com and then cloned it to my local machine with

git clone https://github.com/usrname/mathematics.git

I added 3 new files under the folder mathematics

$ tree 
.
├── LICENSE
├── numerical_analysis
│   └── regression_analysis
│       ├── simple_regression_analysis.md
│       ├── simple_regression_analysis.png
│       └── simple_regression_analysis.py

Now I'd like to upload these 3 new files to my GitHub repository using Python, more specifically PyGithub. Here is what I have tried:

#!/usr/bin/env python
# *-* coding: utf-8 *-*
from github import Github

def main():
    # Step 1: Create a Github instance:
    g = Github("usrname", "passwd")
    repo = g.get_user().get_repo('mathematics')

    # Step 2: Prepare files to upload to GitHub
    files = ['mathematics/numerical_analysis/regression_analysis/simple_regression_analysis.py', 'mathematics/numerical_analysis/regression_analysis/simple_regression_analysis.png']

    # Step 3: Make a commit and push
    commit_message = 'Add simple regression analysis'

    tree = repo.get_git_tree(sha)
    repo.create_git_commit(commit_message, tree, [])
    repo.push()

if __name__ == '__main__':
    main()

I don't know

  • how to get the string sha for repo.get_git_tree
  • how to connect steps 2 and 3, i.e. how to push specific files

Personally, I find the PyGithub documentation hard to read. I have been unable to find the right API after searching for a long time.

SparkAndShine
  • 1
    To get the `sha` you'll need to use `hashlib` – Wayne Werner Jul 26 '16 at 16:01
  • 6
    @WayneWerner that's definitely *not* what he should do. The `sha` is computed by `git` and you'll almost certainly get it wrong if you try to compute it yourself. – Brian Malehorn Jul 26 '16 at 16:03
  • @BrianMalehorn I had a bash script that would upload my git commits via CURL and the github api, IIRC - it's not *that* bad. – Wayne Werner Jul 26 '16 at 16:06
  • You used `sha1sum` to compute the git hash? If so, I'm impressed you got the format right. – Brian Malehorn Jul 26 '16 at 16:13
  • Why not use a bash script? I guess you should have bash installed along with git in your environment. You could then push via ssh with key authentication. – MayeulC Sep 21 '16 at 09:01
  • @MayeulC, I am more familiar with Python. It is easier for me to select specific files to submit. – SparkAndShine Sep 21 '16 at 09:24
  • 2
    Then, how about calling `git` directly to interface with it? Or using a python interface such as [GitPython](https://github.com/gitpython-developers/GitPython#gitpython), not necessarily GitHub oriented? Its documentation is indeed... very sparse, and I would not call it usable. – MayeulC Sep 21 '16 at 09:44
  • @MayeulC, thanks for this info. I am also considering `subprocess.Popen`. Anyway, I'll try GitPython as you mentioned. BTW, can you answer the question? The bounty is about to expire in 2 days. – SparkAndShine Sep 21 '16 at 12:01
  • 1
    My understanding of the documentation is that you want to get the branch, from which you get the HEAD commit (from which you get the `sha` values for the commit, and the base tree). with that in hand, call `create_git_tree` passing the HEAD's tree as base, and giving it a list of `InputGitTreeElement` (setting `content`, but leaving `sha` alone) with your modifications. then call `create_git_commit` with your new tree. Finally, you'll need to get the branch ref and update it to your new commit `sha`. It's probably easier to find some library that wraps this one, than do it yourself though. – Hasturkun Sep 21 '16 at 15:47
  • Does this answer your question? [How to create a commit and push into repo with GitHub API v3?](https://stackoverflow.com/questions/11801983/how-to-create-a-commit-and-push-into-repo-with-github-api-v3) – Arthur Miranda Aug 18 '20 at 13:29

7 Answers

19

I tried to use the GitHub API to commit multiple files. This page for the Git Data API says that it should be "pretty simple". For the results of that investigation, see this answer.

I recommend using something like GitPython:

from git import Repo

repo_dir = 'mathematics'
repo = Repo(repo_dir)
file_list = [
    'numerical_analysis/regression_analysis/simple_regression_analysis.py',
    'numerical_analysis/regression_analysis/simple_regression_analysis.png'
]
commit_message = 'Add simple regression analysis'
repo.index.add(file_list)
repo.index.commit(commit_message)
origin = repo.remote('origin')
origin.push()

Note: This version of the script was run in the parent directory of the repository.

Community
  • You can only add files in the root or subdirs of the current working dir, not from parent directories. – Juha Untinen Apr 03 '18 at 20:19
  • Well, you can simply do ``os.chdir()`` to work on another dir or Git repo, such as when using a separate repository for backups. The above will also continue to work in the "new" git project. – Juha Untinen Apr 04 '18 at 08:04
  • Which path needs to go in the file list? When I give the path of an image on my local machine, Python raises FileNotFoundError (WinError 2). – Shreya Dec 17 '20 at 12:46
9

Note: This version of the script was called from inside the Git repository because I removed the repository name from the file paths.

I finally figured out how to use PyGithub to commit multiple files:

import base64
from github import Github
from github import InputGitTreeElement

token = '5bf1fd927dfb8679496a2e6cf00cbe50c1c87145'
g = Github(token)
repo = g.get_user().get_repo('mathematics')
file_list = [
    'numerical_analysis/regression_analysis/simple_regression_analysis.png',
    'numerical_analysis/regression_analysis/simple_regression_analysis.py'
]
commit_message = 'Add simple regression analysis'
master_ref = repo.get_git_ref('heads/master')
master_sha = master_ref.object.sha
base_tree = repo.get_git_tree(master_sha)
element_list = list()
for entry in file_list:
    with open(entry, 'rb') as input_file:
        data = input_file.read()
    if entry.endswith('.png'):
        data = base64.b64encode(data)
    element = InputGitTreeElement(entry, '100644', 'blob', data)
    element_list.append(element)
tree = repo.create_git_tree(element_list, base_tree)
parent = repo.get_git_commit(master_sha)
commit = repo.create_git_commit(commit_message, tree, [parent])
master_ref.edit(commit.sha)
""" An egregious hack to change the PNG contents after the commit """
for entry in file_list:
    with open(entry, 'rb') as input_file:
        data = input_file.read()
    if entry.endswith('.png'):
        old_file = repo.get_contents(entry)
        commit = repo.update_file('/' + entry, 'Update PNG content', data, old_file.sha)

If I try to add the raw data from a PNG file, the call to create_git_tree eventually calls json.dumps in Requester.py, which causes the following exception to be raised:

UnicodeDecodeError: 'utf8' codec can't decode byte 0x89 in position 0: invalid start byte

I work around this problem by base64-encoding the PNG data and committing that. Later, I use the update_file method to replace it with the real PNG data. This results in two separate commits to the repository, which is probably not what you want.
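One possible way to avoid the second commit, sketched here under assumptions rather than tested against the repository above, is to upload each file as a Git blob first with create_git_blob, sending the content as base64, and then reference the blob's sha in the tree element instead of passing the raw content. The helper names to_base64_text and commit_files are illustrative and not part of PyGithub, and the branch name master is assumed from the code above.

```python
import base64


def to_base64_text(data: bytes) -> str:
    """Return base64 text that is safe to embed in a JSON request body."""
    return base64.b64encode(data).decode("ascii")


def commit_files(repo, file_list, commit_message, branch="master"):
    """Commit every path in file_list to `branch` in a single commit.

    `repo` is expected to be a PyGithub Repository object, and the
    paths are assumed to be relative to the repository root.
    """
    # Imported here so the helper above stays usable without PyGithub.
    from github import InputGitTreeElement

    ref = repo.get_git_ref("heads/" + branch)
    parent = repo.get_git_commit(ref.object.sha)

    elements = []
    for path in file_list:
        with open(path, "rb") as input_file:
            data = input_file.read()
        # Upload the content as a blob; with the "base64" encoding,
        # text and binary files take the same code path.
        blob = repo.create_git_blob(to_base64_text(data), "base64")
        elements.append(
            InputGitTreeElement(path, "100644", "blob", sha=blob.sha)
        )

    # Build the new tree on top of the parent commit's tree.
    tree = repo.create_git_tree(elements, parent.tree)
    commit = repo.create_git_commit(commit_message, tree, [parent])
    # Move the branch to the new commit.
    ref.edit(commit.sha)
    return commit.sha
```

Called as commit_files(repo, file_list, commit_message) from inside the repository directory, this should produce one commit containing both the .py and the .png file.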

3

I can point you to some resources and also offer one concrete solution.

Here you can find examples of adding new files to your repository, and here is a video tutorial for this.

Below you can see a list of Python packages that work with GitHub, found on the developer page of GitHub:

You can also run git commands from IPython if you need to:

In [1]: import subprocess
In [2]: print(subprocess.check_output('git init', shell=True, text=True))
Initialized empty Git repository in /home/code/.git/
In [3]: print(subprocess.check_output('git add .', shell=True, text=True))
In [4]: print(subprocess.check_output('git commit -m "a commit"', shell=True, text=True))
vlad.rad
1

Using subprocess, this does the same work:

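import subprocess
commit_message = 'Add simple regression analysis'  # placeholder message
token = '<personal-access-token>'  # placeholder: your GitHub token
subprocess.call(['git', 'add', '-A'])
subprocess.call(['git', 'commit', '-m', commit_message])
subprocess.call(['git', 'push', 'https://{}@github.com/user-name/repo.git'.format(token)])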

Make sure to use -A (or --all) to stage all the files in the project, including those in parent directories of the current one. Using 'git add .' stages only the files under the cwd where this code is run.
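To stage only specific files, as the question asks, and to stop at the first failing step, the same idea can be sketched with subprocess.run and check=True. The helper names build_git_commands and commit_and_push are hypothetical, and the sketch assumes the remote and its credentials are already configured for the clone.

```python
import subprocess


def build_git_commands(file_list, commit_message, remote="origin"):
    """Return the git command lines that stage, commit, and push file_list."""
    return [
        # "--" separates options from paths, so odd file names stay safe.
        ["git", "add", "--"] + list(file_list),
        ["git", "commit", "-m", commit_message],
        ["git", "push", remote],
    ]


def commit_and_push(repo_dir, file_list, commit_message, remote="origin"):
    """Run each command inside repo_dir.

    check=True raises subprocess.CalledProcessError as soon as a
    command exits with a non-zero status, so later steps are skipped.
    """
    for command in build_git_commands(file_list, commit_message, remote):
        subprocess.run(command, cwd=repo_dir, check=True)
```

The paths in file_list are interpreted relative to repo_dir, because each command runs with that directory as its working directory.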

fin
0
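import subprocess
# Run from inside the repository to read the SHA of the current HEAD commit.
p = subprocess.Popen("git rev-parse HEAD".split(), stdout=subprocess.PIPE)
out, err = p.communicate()
sha = out.decode().strip()  # communicate() returns bytes on Python 3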

There's probably a way to do this with PyGithub, but this should work for a quick hack.
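With PyGithub, one possible way to read the corresponding value is to ask for the branch and take the sha of its head commit. Note that this is the head of the branch on GitHub, which can differ from the local HEAD if there are unpushed commits. This is only a sketch: head_sha is a hypothetical helper and master is assumed to be the branch name.

```python
def head_sha(repo, branch="master"):
    """Return the SHA of the latest commit on `branch`.

    `repo` is expected to be a PyGithub Repository object.
    """
    return repo.get_branch(branch).commit.sha
```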

Brian Malehorn
0

If you do not need PyGithub specifically, the dulwich Git library offers high-level git commands. For the available commands, have a look at https://www.dulwich.io/apidocs/dulwich.porcelain.html

janbrohl
0

If PyGithub's documentation is not usable (and it does not look usable), and you just want to push a commit (without doing anything fancy with issues, repo configuration, etc.), you would probably be better off interfacing with git directly, either by calling the git executable or by using a wrapper library such as GitPython.

Using git directly with something such as the subprocess.Popen you mentioned would probably have a gentler learning curve, but it would also be more difficult in the long term for error handling, etc., since you don't really have nice abstractions to pass around and would have to do the parsing yourself.

Getting rid of PyGithub also frees you from being tied to GitHub and its API, allowing you to push to any repo, even another folder on your computer.

MayeulC