182

What are people's experiences with any of the Git modules for Python? (I know of GitPython, PyGit, and Dulwich - feel free to mention others if you know of them.)

I am writing a program which will have to interact (add, delete, commit) with a Git repository, but have no experience with Git, so one of the things I'm looking for is ease of use/understanding with regards to Git.

The other things I'm primarily interested in are maturity and completeness of the library, a reasonable lack of bugs, continued development, and helpfulness of the documentation and developers.

If you think of something else I might want/need to know, please feel free to mention it.

PTBNL
  • 6,042
  • 4
  • 28
  • 34

11 Answers

133

While this question was asked a while ago and I don't know the state of the libraries at that point, it is worth mentioning for searchers that GitPython does a good job of abstracting the command-line tools so that you don't need to use subprocess. There are some useful built-in abstractions that you can use, but for everything else you can do things like:

import git

repo = git.Repo('/home/me/repodir')
print(repo.git.status())
# checkout and track a remote branch
print(repo.git.checkout('origin/somebranch', b='somebranch'))
# add a file
print(repo.git.add('somefile'))
# commit
print(repo.git.commit(m='my commit message'))
# now we are one commit ahead
print(repo.git.status())

Everything else in GitPython just makes it easier to navigate. I'm fairly well satisfied with this library and appreciate that it is a wrapper on the underlying git tools.

UPDATE: I've switched to using the sh module, not just for git but for most command-line utilities I need in Python. To replicate the above I would do this instead:

import sh

git = sh.git.bake(_cwd='/home/me/repodir')
print(git.status())
# checkout and track a remote branch
print(git.checkout('-b', 'somebranch', 'origin/somebranch'))
# add a file
print(git.add('somefile'))
# commit
print(git.commit(m='my commit message'))
# now we are one commit ahead
print(git.status())
underrun
  • 6,713
  • 2
  • 41
  • 53
  • 2
    The excellent Legit tool uses GitPython: https://github.com/kennethreitz/legit/blob/develop/legit/scm.py – forivall Sep 17 '12 at 17:17
  • 11
    Based on this answer, I just tried my luck with git-python. I find the API strange to deal with. Most of the time you have to fall back to the repo.git.* general interface, and even that does not work properly at times (e.g. `repo.git.branch(b=somebranch)` works but `repo.git.branch(D=somebranch)` does not since a space is missing). I guess I'll implement a subprocess-based general function myself. I'm sad, I had high hopes. :-/ – Christoph Jan 30 '13 at 22:08
  • above I meant `repo.git.checkout(b=somebranch)`, not `branch`, of course. late-night typo, sorry. – Christoph Jan 31 '13 at 10:13
  • 6
    i've switched to using the sh module now with `git = sh.git.bake(_cwd=repopath)`. it works awesomely. – underrun Jan 31 '13 at 14:09
  • 10
    link to sh: http://amoffat.github.io/sh/ really should be part of python stdlib. –  Jun 28 '13 at 19:12
  • 5
    Latest python sh version does not work on Windows. Complete utter fail. – void.pointer Feb 28 '14 at 00:12
  • `sh` is magnificent, thanks! – michel-slm Jan 19 '15 at 11:31
  • 1
    The only problem with sh is clone and pull if you don't have ssh-agent setup. There doesn't seem to be a way of getting the user to enter username / password because terminal interaction is not yet supported in sh - https://github.com/amoffat/sh/issues/92 – Stuart Axon May 03 '16 at 10:59
  • @StuartAxon that's true, but there are methods outside of ssh-agent and the various keyring management. i do think key management is the way to go, but if you are able to do readonly access without auth then git protocol and http(s) work well. – underrun Sep 26 '16 at 22:11
  • +1 for the `sh` suggestion. The ability to execute almost identical command line equivalent Git commands from Python has reduced Git-related errors greatly. The one-to-one relationship hardly imposes a learning curve and eliminates learning odd API syntax -- despite their best efforts. Furthermore, it removes any dependencies on API versions or their shortcomings. Git via ``sh`` is only limited by Git itself. – Frelling Nov 05 '17 at 01:43
  • 3
    What are the advantages that `sh` provides over `subprocess`? – Sid Sep 03 '19 at 19:08
  • 3
    From the GitPython README: "GitPython is not suited for long-running processes (like daemons) as it tends to leak system resources." – Boris Verkhovskiy Jan 21 '20 at 17:25
86

I thought I would answer my own question, since I'm taking a different path than suggested in the answers. Nonetheless, thanks to those who answered.

First, a brief synopsis of my experiences with GitPython, PyGit, and Dulwich:

  • GitPython: After downloading, I got this imported and the appropriate object initialized. However, trying to do what was suggested in the tutorial led to errors. Lacking more documentation, I turned elsewhere.
  • PyGit: This would not even import, and I could find no documentation.
  • Dulwich: Seems to be the most promising (at least for what I wanted and saw). I made some progress with it, more than with GitPython, since its egg comes with Python source. However, after a while, I decided it may just be easier to try what I did.

Also, StGit looks interesting, but I would need the functionality extracted into a separate module and do not want to wait for that to happen right now.

In (much) less time than I spent trying to get the three modules above working, I managed to get git commands working via the subprocess module, e.g.

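import subprocess

def gitAdd(fileName, repoDir):
    # Run "git add <fileName>" inside the repository directory.
    cmd = ['git', 'add', fileName]
    p = subprocess.Popen(cmd, cwd=repoDir)
    p.wait()  # block until git has finished

gitAdd('exampleFile.txt', '/usr/local/example_git_repo_dir')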

This isn't fully incorporated into my program yet, but I'm not anticipating a problem, except maybe speed (since I'll be processing hundreds or even thousands of files at times).
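
On the speed concern: one way to cut down on process launches is to pass many paths to each `git add` call rather than starting one process per file. The following is only a minimal sketch of that idea. The names `chunked`, `build_add_commands`, and `git_add_many`, and the batch size of 100, are assumptions made for illustration; a suitable batch size depends on the platform's command-line length limit.

```python
import subprocess


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_add_commands(file_names, batch_size=100):
    """Return one `git add` argument list per batch of files.

    The `--` separates options from paths, so a file name that starts
    with a dash is not mistaken for an option.
    """
    return [['git', 'add', '--'] + batch
            for batch in chunked(list(file_names), batch_size)]


def git_add_many(file_names, repo_dir, batch_size=100):
    """Stage many files using one git process per batch (hypothetical helper)."""
    for cmd in build_add_commands(file_names, batch_size):
        # check=True raises CalledProcessError if git exits with an error.
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

With this sketch, a thousand files would need about ten git processes instead of a thousand, for example `git_add_many(list_of_paths, '/usr/local/example_git_repo_dir')`.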

Maybe I just didn't have the patience to get things going with Dulwich or GitPython. That said, I'm hopeful the modules will get more development and be more useful soon.

Tidhar Klein Orbach
  • 2,896
  • 2
  • 30
  • 47
PTBNL
  • 6,042
  • 4
  • 28
  • 34
32

I'd recommend pygit2, which provides Python bindings to the excellent libgit2 library.

Alex Chamberlain
  • 4,147
  • 2
  • 22
  • 49
tamale
  • 329
  • 3
  • 2
  • 1
    It gives best access to git plumbing also. – pielgrzym Jun 24 '12 at 11:04
  • `pygit2` is a really useful library, and I look forward to it expanding in the future! – Alex Chamberlain Aug 23 '12 at 14:20
  • 2
    As it is now, one must manually download and compile/setup semi-stable versions of both `libgit` and `pygit2`, taking the source from GitHub. Problem is, head branches have broken tests, and latest "stable" fail installation... Not a suitable solution if reliability is important and you need to deploy in a variety of environments... :( – mac Dec 04 '12 at 10:32
  • 1
    stay away from this combination if you ever plan on clients using cygwin. pygit2 is a wrapper for libgit2 and libgit2 has dropped all cygwin support. The comment I got from one of the dev's, "You can try, but it be a miracle if it builds" beautiful API, yes, but half my clients are cygwin therefore I can't use it. Probably going to GitPython. – scphantm Sep 23 '13 at 17:55
  • With Homebrew on Mac, I found using this to be delightful. `brew install libgit2; pip install pygit2` – geowa4 May 10 '15 at 02:36
  • 2
    Note that they don't support cygwin because [their focus is on native Windows support instead](https://github.com/libgit2/libgit2/issues/476#issuecomment-2697761). So while it's correct that libgit2 isn't supported on cygwin, it *doesn't* mean that Windows users are left out in the cold. – Xiong Chiamiov Dec 09 '16 at 01:13
19

It may help to know that Bazaar and Mercurial both use Dulwich for their Git interoperability.

Dulwich is probably different from the others in that it is a reimplementation of Git in Python. The others may just be wrappers around Git's commands, which could make them simpler to use from a high-level point of view (commit/add/delete). It also probably means their APIs are very close to Git's command line, so you would still need to gain experience with Git.
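
To make "reimplementation" concrete: a library of that kind has to reproduce Git's own data formats rather than call the `git` binary. As a small illustration using only the standard library (this is not Dulwich's code, and `git_blob_id` is a name made up for this sketch), here is how the object ID of a file's contents (a "blob") can be computed for a SHA-1 repository:

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes a header ("blob <size in bytes>\0") followed by the raw content.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


# The empty blob has a well-known ID.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

The result should match what `git hash-object <file>` prints for the same bytes.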

Will Hardy
  • 14,588
  • 5
  • 44
  • 43
tonfa
  • 24,151
  • 2
  • 35
  • 41
19

This is a pretty old question, and while looking for Git libraries, I found one that was made this year (2013) called Gittle.

It worked great for me (where the others I tried were flaky), and seems to cover most of the common actions.

Some examples from the README:

from gittle import Gittle

# Clone a repository
repo_path = '/tmp/gittle_bare'
repo_url = 'git://github.com/FriendCode/gittle.git'
repo = Gittle.clone(repo_url, repo_path)

# Stage multiple files
repo.stage(['other1.txt', 'other2.txt'])

# Do the commit
repo.commit(name="Samy Pesse", email="samy@friendco.de", message="This is a commit")

# Authentication with RSA private key
key_file = open('/Users/Me/keys/rsa/private_rsa')
repo.auth(pkey=key_file)

# Do push
repo.push()
gak
  • 32,061
  • 28
  • 119
  • 154
  • 3
    i don't like that you "stage" files instead of "add" them to the index. changing names of common/important operations just seems like it would be confusing. – underrun Feb 05 '14 at 21:53
  • 3
    @underrun adding is adding files to the stage. Isn't that the same with staging files ? – Jimmy Kane Feb 06 '14 at 13:12
  • adding files is staging files to be committed (it is adding them to the index). the operation is the same but at the command line you would type `git add other1.txt other2.txt` so it doesn't follow what would be expected. – underrun Feb 06 '14 at 15:22
  • 1
    Agreed on the superiority of this package. I've even been able to use it within the Pythonista app after installing StaSh, which it was packaged with. Also, it is worth noting that your answer is the most recently updated out of the answers to this question. – Chris Redford Apr 09 '16 at 18:30
  • 1
    Actually, it seems to *only* work for me on Pythonista. Getting it to password authenticate a clone of a private bitbucket repo on my Mac was a nightmare I finally gave up on. – Chris Redford Apr 09 '16 at 22:00
  • Final update: can't even get commits working in Pythonista. Conclusion: there is no reliable way to have a private bitbucket git ecosystem in Pythonista. – Chris Redford Apr 09 '16 at 22:18
  • Exactly what I wanted but no proper python3 support as of now – Stephan Schielke Oct 24 '17 at 15:30
  • Nice options. [The project](https://pypi.org/project/gittle/) is already [Upgrades setup for Py3 conversion and runs 2to3](https://github.com/FriendCode/gittle/commit/01ef268a57194b8ab82865e7388a0bdd99f3fd48). – eQ19 Jul 04 '19 at 13:39
7

An updated answer reflecting changed times:

GitPython is currently the easiest to use. It wraps many git plumbing commands and has a pluggable object database (Dulwich being one of them), and if a command isn't implemented, it provides an easy API for shelling out to the command line. For example:

from git import Repo

repo = Repo('.')
repo.git.checkout(b='new_branch')

This calls:

bash$ git checkout -b new_branch
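
The keyword-to-flag translation can be pictured roughly as follows. This is a simplified sketch of the idea, not GitPython's actual implementation; `build_git_command` and its conversion rules are assumptions made for illustration.

```python
def build_git_command(subcommand, *args, **options):
    """Build an argument list for git from Python-style arguments.

    Rules assumed for this sketch:
      - one-letter keywords become short options:  b='x'       -> -b x
      - longer keywords become long options:       max_count=5 -> --max-count=5
      - a value of True emits the bare flag; False or None omits it
    """
    cmd = ['git', subcommand]
    for name, value in options.items():
        if value is None or value is False:
            continue
        if len(name) == 1:
            flag = '-' + name
            cmd += [flag] if value is True else [flag, str(value)]
        else:
            flag = '--' + name.replace('_', '-')
            cmd.append(flag if value is True else '{}={}'.format(flag, value))
    # Positional arguments go after the options.
    cmd.extend(str(arg) for arg in args)
    return cmd


print(build_git_command('checkout', b='new_branch'))
# ['git', 'checkout', '-b', 'new_branch']
```

The resulting list could then be handed to `subprocess`, which is essentially what "shelling out" means here.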

Dulwich is also good but much lower level. It's somewhat of a pain to use because it requires operating on git objects at the plumbing level and doesn't offer the nice porcelain you'd normally want. However, if you plan on modifying any parts of git, or on using git-receive-pack and git-upload-pack, you need to use Dulwich.

Jon Chu
  • 1,877
  • 2
  • 20
  • 19
7

For the sake of completeness, http://github.com/alex/pyvcs/ is an abstraction layer for all DVCSs. It uses Dulwich, but provides interop with the other DVCSs.

Justin Abrahms
  • 1,289
  • 1
  • 12
  • 21
2

PTBNL's answer works well for me. Here is a slightly extended version for Windows users.

import subprocess
import time


def gitAdd(fileName, repoDir):
    cmd = 'git add ' + fileName
    pipe = subprocess.Popen(cmd, shell=True, cwd=repoDir,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            universal_newlines=True)
    # communicate() waits for the process to exit, so no separate wait() is needed
    out, error = pipe.communicate()
    print(out, error)


def gitCommit(commitMessage, repoDir):
    # the message is pasted into a shell string, so it must not contain double quotes
    cmd = 'git commit -am "%s"' % commitMessage
    pipe = subprocess.Popen(cmd, shell=True, cwd=repoDir,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            universal_newlines=True)
    out, error = pipe.communicate()
    print(out, error)


def gitPush(repoDir):
    cmd = 'git push'
    pipe = subprocess.Popen(cmd, shell=True, cwd=repoDir,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            universal_newlines=True)
    out, error = pipe.communicate()


# build a commit message of the form year_month_day_hour_minute
temp = time.localtime(time.time())
uploaddate = '_'.join(str(part) for part in temp[:5])

# your Git repository; on Windows, double each backslash in the path
repoDir = 'd:\\c_Billy\\vfat\\Programming\\Projector\\billyccm'
gitAdd('.', repoDir)
gitCommit(uploaddate, repoDir)
gitPush(repoDir)
Billy Jin
  • 59
  • 2
1

Here's a really quick implementation of "git status":

import os
from subprocess import Popen, PIPE

repoDir = '/Users/foo/project'

# Note: the parsing below expects status lines prefixed with "#", as printed
# by older Git versions; newer versions may omit this prefix.

def command(x):
    # universal_newlines=True returns the output as text rather than bytes
    return Popen(x.split(' '), stdout=PIPE, universal_newlines=True).communicate()[0]

def rm_empty(L): return [l for l in L if l]

def getUntracked():
    os.chdir(repoDir)
    status = command("git status")
    if "# Untracked files:" in status:
        untf = status.split("# Untracked files:")[1][1:].split("\n")
        return rm_empty([x[2:] for x in untf if x.strip() != "#" and x.startswith("#\t")])
    else:
        return []

def getNew():
    os.chdir(repoDir)
    status = command("git status").split("\n")
    return [x[14:] for x in status if x.startswith("#\tnew file:   ")]

def getModified():
    os.chdir(repoDir)
    status = command("git status").split("\n")
    return [x[14:] for x in status if x.startswith("#\tmodified:   ")]

print("Untracked:")
print(getUntracked())
print("New:")
print(getNew())
print("Modified:")
print(getModified())
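
Parsing the human-readable output of `git status`, as above, is fragile because its wording and layout are meant for people and can differ between Git versions. A hedged alternative sketch uses `git status --porcelain`, whose short format is intended for scripts. The function names below are made up for illustration, and the parser deliberately ignores renames and quoted paths.

```python
import subprocess


def parse_porcelain(output):
    """Sort `git status --porcelain` lines into untracked/new/modified lists.

    Each line is two status characters, a space, then the path.
    """
    result = {'untracked': [], 'new': [], 'modified': []}
    for line in output.splitlines():
        if len(line) < 4:
            continue  # skip blank or malformed lines
        code, path = line[:2], line[3:]
        if code == '??':
            result['untracked'].append(path)
        elif code[0] == 'A':
            result['new'].append(path)       # added to the index
        elif 'M' in code:
            result['modified'].append(path)  # modified, staged or not
    return result


def get_status(repo_dir):
    """Run git in repo_dir and return the parsed status (requires git on PATH)."""
    completed = subprocess.run(
        ['git', 'status', '--porcelain'],
        cwd=repo_dir, stdout=subprocess.PIPE,
        universal_newlines=True, check=True)
    return parse_porcelain(completed.stdout)
```

Calling `get_status('/Users/foo/project')` would return a dictionary with the three lists, without needing `os.chdir`.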
Shane Geiger
  • 145
  • 1
  • 4
0

The Git interaction library that is part of StGit is actually pretty good. It isn't broken out as a separate package, but if there is sufficient interest, I'm sure that can be fixed.

It has very nice abstractions for representing commits, trees, etc., and for creating new commits and trees.

dkagedal
  • 578
  • 2
  • 7
  • 14
-3

For the record, none of the aforementioned Git Python libraries seem to contain a "git status" equivalent, which is really the only thing I would want since dealing with the rest of the git commands via subprocess is so easy.

xdissent
  • 949
  • 7
  • 7