3

I'm using GitPython to clone a repo within my program. I figured out how to display the status of the clone with the clone_from command, but I want the status to look more like a tqdm progress bar. I tried using the requests library to get the size of the file, but I'm still unsure how to implement it. I tried something like the code below, but it's not working. Any help appreciated, thanks.

url = 'git@github.com:somegithubrepo/repo.git'
r = requests.get(url, stream=True)
total_length = r.headers.get('content-length')

for i in tqdm(range(len(total_length??))):
    git.Git(pathName).clone(url)

4 Answers

6

This is an improved version of the other answer. The bar is created only once, when the CloneProgress class is instantiated. On each update, the bar's total and position are set to the values reported by the callback, so the bar shows the correct amount.

import git
from git import RemoteProgress
from tqdm import tqdm

class CloneProgress(RemoteProgress):
    def __init__(self):
        super().__init__()
        self.pbar = tqdm()

    def update(self, op_code, cur_count, max_count=None, message=''):
        self.pbar.total = max_count
        self.pbar.n = cur_count
        self.pbar.refresh()

git.Repo.clone_from(project_url, repo_dir, branch='master', progress=CloneProgress())
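
The type hints in the other answers on this page suggest that `cur_count` and `max_count` can arrive as strings, floats, or `None`, so assigning `max_count` straight to `pbar.total` may not always behave well. Below is a minimal sketch of a conversion step. `normalize_counts` is a hypothetical helper (not part of GitPython or tqdm), and treating `None` and empty strings as "not reported" is an assumption.

```python
from __future__ import annotations


def normalize_counts(
    cur_count: str | float | None,
    max_count: str | float | None = None,
) -> tuple[float, float | None]:
    """Convert progress counts to floats, using None for an unknown total.

    Hypothetical helper: None and empty strings are treated as "not reported".
    """
    current = float(cur_count) if cur_count not in (None, "") else 0.0
    total = float(max_count) if max_count not in (None, "") else None
    return current, total


# Inside update(), the bar could then be driven like this:
#     current, total = normalize_counts(cur_count, max_count)
#     self.pbar.total = total   # None means "total unknown" to tqdm
#     self.pbar.n = current
#     self.pbar.refresh()
```

With a total of `None`, tqdm shows a counter without a percentage, which seems a reasonable fallback for stages where no total is reported.
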
Cosmos Zhu
2

This answer provides a working example based on rich's progress bar, with a single progress bar and a new task dispatched for each step of the cloning process. It is quite similar to my other answer, but I think keeping them in two separate posts helps readability. The main difference is that rich allows multiple tasks within one progress bar context, so the context needs to be entered and exited only once rather than once per stage.

from __future__ import annotations

import git
from rich import console, progress


class GitRemoteProgress(git.RemoteProgress):
    OP_CODES = [
        "BEGIN",
        "CHECKING_OUT",
        "COMPRESSING",
        "COUNTING",
        "END",
        "FINDING_SOURCES",
        "RECEIVING",
        "RESOLVING",
        "WRITING",
    ]
    OP_CODE_MAP = {
        getattr(git.RemoteProgress, _op_code): _op_code for _op_code in OP_CODES
    }

    def __init__(self) -> None:
        super().__init__()
        self.progressbar = progress.Progress(
            progress.SpinnerColumn(),
            # *progress.Progress.get_default_columns(),
            progress.TextColumn("[progress.description]{task.description}"),
            progress.BarColumn(),
            progress.TextColumn("[progress.percentage]{task.percentage:>3.0f}%"),
            "eta",
            progress.TimeRemainingColumn(),
            progress.TextColumn("{task.fields[message]}"),
            console=console.Console(),
            transient=False,
        )
        self.progressbar.start()
        self.active_task = None

    def __del__(self) -> None:
        # logger.info("Destroying bar...")
        self.progressbar.stop()

    @classmethod
    def get_curr_op(cls, op_code: int) -> str:
        """Get OP name from OP code."""
        # Remove BEGIN- and END-flag and get op name
        op_code_masked = op_code & cls.OP_MASK
        return cls.OP_CODE_MAP.get(op_code_masked, "?").title()

    def update(
        self,
        op_code: int,
        cur_count: str | float,
        max_count: str | float | None = None,
        message: str | None = "",
    ) -> None:
        # Start new bar on each BEGIN-flag
        if op_code & self.BEGIN:
            self.curr_op = self.get_curr_op(op_code)
            # logger.info("Next: %s", self.curr_op)
            self.active_task = self.progressbar.add_task(
                description=self.curr_op,
                total=max_count,
                message=message,
            )

        self.progressbar.update(
            task_id=self.active_task,
            completed=cur_count,
            message=message,
        )

        # End progress monitoring on each END-flag
        if op_code & self.END:
            # logger.info("Done: %s", self.curr_op)
            self.progressbar.update(
                task_id=self.active_task,
                message=f"[bright_black]{message}",
            )
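
The `update` method above relies on `op_code` being a bit field: stage flags (`BEGIN`, `END`) are combined with a single operation identifier, and `OP_MASK` strips the stage flags. The sketch below illustrates that decoding in isolation. The numeric values and the `decode` function are stand-ins chosen for illustration; in real code the constants should be read from `git.RemoteProgress` as the class above does.

```python
from __future__ import annotations

# Stand-in flag values for illustration only; in real code read them
# from git.RemoteProgress instead of hard-coding numbers.
BEGIN = 1 << 0
END = 1 << 1
COUNTING = 1 << 2
RECEIVING = 1 << 5
STAGE_MASK = BEGIN | END
OP_MASK = ~STAGE_MASK

OP_NAMES = {COUNTING: "Counting", RECEIVING: "Receiving"}


def decode(op_code: int) -> tuple[str, bool, bool]:
    """Split a combined op code into (operation name, is_begin, is_end)."""
    # Masking removes the BEGIN/END bits so only the operation id remains.
    name = OP_NAMES.get(op_code & OP_MASK, "?")
    return name, bool(op_code & BEGIN), bool(op_code & END)
```

For example, `decode(RECEIVING | BEGIN)` reports the receiving operation with the begin flag set, which is the condition under which the class above adds a new task.
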

Using it - full clone:

project_url = "https://github.com/u-boot/u-boot"

print("Cloning Git Repository 'u-boot' ('master' branch)...")
git.Repo.clone_from(
    url=project_url, 
    to_path="u-boot",
    progress=GitRemoteProgress(),
)
print("Done.")

Progress bar in action - full clone

Using it - shallow clone:

project_url = "https://github.com/u-boot/u-boot"

print("Cloning Git Repository 'u-boot' ('master' branch)...")
git.Repo.clone_from(
    url=project_url, 
    to_path="u-boot",
    depth=1,
    progress=GitRemoteProgress(),
)
print("Done.")

Progress bar in action - shallow clone


PS: u-boot repo used due to its large size and therefore followable cloning progress

lcnittl
1

You can try something like:

    import git
    from git import RemoteProgress
    from tqdm import tqdm
    
    
    class CloneProgress(RemoteProgress):
        def update(self, op_code, cur_count, max_count=None, message=''):
            pbar = tqdm(total=max_count)
            pbar.update(cur_count)
    
    git.Repo.clone_from(project_url, repo_dir, branch='master', progress=CloneProgress())
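
One thing to keep in mind with this approach: tqdm's `update(n)` adds `n` to the bar's position, whereas `cur_count` in this callback is a running count for the current stage, so passing it straight to `update` would overshoot on a bar that is kept between calls. A minimal sketch of converting running counts into increments follows. `DeltaTracker` is a hypothetical helper, and treating a lower count as the start of a new stage is an assumption.

```python
class DeltaTracker:
    """Turn running counts into increments, resetting when the count drops."""

    def __init__(self) -> None:
        self.last = 0.0

    def step(self, cur_count: float) -> float:
        # A lower count than before is taken to mean a new stage started.
        if cur_count < self.last:
            self.last = 0.0
        delta = cur_count - self.last
        self.last = cur_count
        return delta


# Possible use inside update(), with a bar created once in __init__:
#     self.pbar.update(self.tracker.step(float(cur_count)))
```
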
-1

I know the question seems to mainly focus on using tqdm, however, if one wanted to use alive-progress progress bar, here is a working example that dispatches a progress bar for each step of the cloning process:

from __future__ import annotations

import git
from alive_progress import alive_bar


class GitRemoteProgress(git.RemoteProgress):
    OP_CODES = [
        "BEGIN",
        "CHECKING_OUT",
        "COMPRESSING",
        "COUNTING",
        "END",
        "FINDING_SOURCES",
        "RECEIVING",
        "RESOLVING",
        "WRITING",
    ]
    OP_CODE_MAP = {
        getattr(git.RemoteProgress, _op_code): _op_code for _op_code in OP_CODES
    }

    def __init__(self) -> None:
        super().__init__()
        self.alive_bar_instance = None

    @classmethod
    def get_curr_op(cls, op_code: int) -> str:
        """Get OP name from OP code."""
        # Remove BEGIN- and END-flag and get op name
        op_code_masked = op_code & cls.OP_MASK
        return cls.OP_CODE_MAP.get(op_code_masked, "?").title()

    def update(
        self,
        op_code: int,
        cur_count: str | float,
        max_count: str | float | None = None,
        message: str | None = "",
    ) -> None:
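        cur_count = float(cur_count)
        # The total may be missing (None or an empty string) for some stages
        max_count = float(max_count) if max_count else None

        # Start new bar on each BEGIN-flag
        if op_code & self.BEGIN:
            self.curr_op = self.get_curr_op(op_code)
            self._dispatch_bar(title=self.curr_op)

        # Only report a fraction when the total is known and non-zero
        if max_count:
            self.bar(cur_count / max_count)
        self.bar.text(message)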

        # End progress monitoring on each END-flag
        if op_code & git.RemoteProgress.END:
            self._destroy_bar()

    def _dispatch_bar(self, title: str | None = "") -> None:
        """Create a new progress bar"""
        self.alive_bar_instance = alive_bar(manual=True, title=title)
        self.bar = self.alive_bar_instance.__enter__()

    def _destroy_bar(self) -> None:
        """Destroy an existing progress bar"""
        self.alive_bar_instance.__exit__(None, None, None)
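
The `_dispatch_bar` and `_destroy_bar` methods call `__enter__` and `__exit__` by hand because the bar has to stay open across several `update` calls. One alternative is `contextlib.ExitStack` from the standard library, which keeps track of entered contexts and closes them on request. The sketch below uses a stand-in context manager, `fake_bar`, so it runs without alive-progress. `fake_bar`, `BarHolder`, and `events` are hypothetical names for illustration.

```python
from contextlib import ExitStack, contextmanager

events = []  # records what happened, in order


@contextmanager
def fake_bar(title):
    """Stand-in for a progress-bar context manager (hypothetical)."""
    events.append(f"open {title}")
    try:
        # The yielded callable plays the role of the bar handle.
        yield lambda fraction: events.append(f"{title} {fraction:.0%}")
    finally:
        events.append(f"close {title}")


class BarHolder:
    def __init__(self):
        self._stack = ExitStack()
        self.bar = None

    def dispatch(self, title):
        self.destroy()  # make sure a previous bar is closed first
        self.bar = self._stack.enter_context(fake_bar(title))

    def destroy(self):
        self._stack.close()  # exits every context entered so far
        self.bar = None
```

Calling `dispatch` for a new stage closes the previous bar before opening the next one, so a missed END flag would not leave two bars open at once.
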

Using it - full clone:

project_url = "https://github.com/u-boot/u-boot"

print("Cloning Git Repository 'u-boot' ('master' branch)...")
git.Repo.clone_from(
    url=project_url, 
    to_path="u-boot",
    progress=GitRemoteProgress(),
)
print("Done.")

Progress bar in action - full clone

Using it - shallow clone:

project_url = "https://github.com/u-boot/u-boot"

print("Cloning Git Repository 'u-boot' ('master' branch)...")
git.Repo.clone_from(
    url=project_url, 
    to_path="u-boot",
    depth=1,
    progress=GitRemoteProgress(),
)
print("Done.")

Progress bar in action - shallow clone


PS: u-boot repo used due to its large size and therefore followable cloning progress
PPS: The recording uses rich's print, hence the fancy colors :-)

lcnittl
  • This example is no longer working. `TypeError: unsupported operand type(s) for /: 'float' and 'str'` – Andrea Ricchi Jun 20 '22 at 22:20
  • @AndreaRicchi Which OS and which Python version? – lcnittl Jun 22 '22 at 05:04
  • So this is the error: `File "gitremoteprogress.py", line 46, in update` ` self.bar(cur_count / max_count)` `TypeError: unsupported operand type(s) for /: 'float' and 'str'` on Ubuntu 20.04 with Python 3.8.10. – Andrea Ricchi Jun 23 '22 at 07:23
  • Interesting. For me it runs on Windows 11 from Python 3.7–3.10 and on Ubuntu 20.04 with Python 3.8. In any case, can you try the updated version? – lcnittl Jun 25 '22 at 11:28
  • It would be interesting to know which value `max_count` has in your case, such that it is not given as a float. – lcnittl Jun 25 '22 at 12:02