I am trying to convince my colleague that using a subprocess to get a repo's head is a bad idea, because creating a process has a lot of overhead. To convince him, I wrote two scripts and profiled them, but the results were not what I expected (I expected python-git to be faster than subprocess).
This is the first script I profiled, test_git_module.py:
import git

def test():
    # Open the repository containing the current directory, searching parent directories.
    repo = git.Repo(".", search_parent_directories=True)

test()
After profiling this with cProfile (python3 -m cProfile test_git_module -s), the output I got was 78059 function calls (75806 primitive calls) in 0.130 seconds.
On the other hand, when I profiled the second script, test_subprocess.py, the output was 6529 function calls (6430 primitive calls) in 0.017 seconds.
test_subprocess.py:
import subprocess
import os
import sys

def test():
    SELF_DIRPATH = os.path.dirname(__file__)
    # Ask git for the top-level directory of the enclosing repository.
    WORKSPACE_DIRPATH = (
        subprocess.run(["git", "rev-parse", "--show-toplevel"], stdout=subprocess.PIPE, check=True)
        .stdout.decode(sys.stdout.encoding)
        .strip()
    )

test()
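One caveat about both measurements: running cProfile on a whole script also counts the time spent executing the import statements, so the totals may reflect import cost as much as the call itself. A minimal sketch of timing only the calls is below. The helper name `best_time` and its `repeat` parameter are hypothetical and are not part of either script.

```python
import time


def best_time(fn, repeat=5):
    """Return the best wall-clock time in seconds over `repeat` calls of fn.

    Hypothetical helper: it times only the call itself, so any module
    imports done beforehand are excluded from the measurement.
    """
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best
```

Under these assumptions, calling `best_time(lambda: git.Repo(".", search_parent_directories=True))` after `import git`, and `best_time` on a lambda wrapping the `subprocess.run([...])` call from test_subprocess.py, would compare the per-call cost of the two approaches without the one-time import cost.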
So clearly python-git is not helping here; it is the slower of the two for this kind of task. This brings me to my question: when and why should anyone use python-git over a subprocess?