
I have two local repos on my computer: A & B. I need to copy A to B so that all branches are copied, but with only the latest 5 commits of history.

When copying, I need any branch named master to be renamed to main. Each branch then needs to be reconnected with the remote so that I can publish the branches to GitHub.com.

I have attempted to write a bash script to handle this, but the problem is that the branches lose the connection to the remote and cannot be published.

GitHub Desktop error:

Unable to push commits to this branch because there are commits on the remote that are not present on your local branch. Fetch these new commits before pushing in order to reconcile them with your local commits.

My .sh file:

#!/bin/bash
set -e

repo_name=TestRepo
dir=/Users/username/development/$repo_name
repo_src_url=git@github.com:my_username/$repo_name.git
repo_target_url=git@github.com:my_username/$repo_name.git

git clone --no-single-branch --depth 5 $repo_src_url $dir \
  && cd $dir \
  && git branch -m master main \
  && git commit --amend --author "John Smith <john@johnsmith.com>" -m "Initial Commit" \
  && git branch --unset-upstream \
  && git remote set-url origin $repo_target_url
John Kugelman
stwhite
  • Totally unclear. What would multiple branches but no history even mean? You just want a collection of unrelated commits floating in a kind of virtual limbo? – matt Jul 30 '22 at 19:03
  • @matt all branches of repo should be included, but either no commit history or at the very least the latest commit. – stwhite Jul 30 '22 at 23:40
  • You may find it easier to 1) create an empty repo 2) check out a branch in the old repo 3) create same branch in the new repo 4) copy all files to the empty repo 5) add and commit them … repeat for all branches, ensuring to create main first and branches off main. There are existing questions with some more detailed answers (e.g. search for “squash all history”) but ultimately what you are asking for is sufficiently unusual to be unique afaik. – AD7six Jul 31 '22 at 08:51

1 Answer

I'll put this up top here, but per your comments below, you've started out with the wrong question (even after updating it). What you want is described as follows:

  • There is some existing repository (and/or clones of it but we care about a particular repo on GitHub). It has N1 > 1 branch names in it, selecting N2 > 1 tip commits. One of these branch names is or may be master.

  • We'd like to clone this repository and create, locally, all N1 branch names, selecting all N2 commits, while changing the name master to main.

  • Then we'd like to rewrite all the commits so that each branch has a history whose depth is no more than 5 commits, but all the tip commits retain any relative relationship they had before.

  • We will then push this to GitHub, into a currently-empty repository.

  • It's not clear what, if anything, we wish to do with all the tag names.

(Then we want to repeat this for more repositories.)

There is in fact no fully general solution to this problem as stated because it may require more than 5 or so commits on each branch, depending on relationships. For a trivial example, consider the following graph:

A   <-- br1
 \
  B   <-- br2
   \
    C   <-- br3
     \
      D   <-- br4
       \
        E   <-- br5
         \
          F   <-- master

Here, to keep the relationships between commits intact, we must in fact copy all six commits (and hence there's no history rewriting involved in the end, in this example; we would just change the name master to main and be done with things). Each of the six commits is a tip commit on a branch: A, the root commit of the repository, is the tip commit of br1, B is the tip of br2, and so on through F, which is the tip of master. But the final depth of main (renamed from master) will be 6, not 5.

Fortunately, the "depth no more than 5 commits" constraint appears to be very weak here: we'd just like to perhaps retain a few extra commits in some cases. For instance, given this repository:

A--B--C--D--E--F--G   <-- br1
                   \
                    H--I--J--K--L--M--N   <-- master

we might produce, as our revised history:

C'-D'-E'-F'-G'  <-- br1
             \
              J'-K'-L'-M'-N'  <-- main

That is, we've shrunk the depth of br1 to 5, and retained just five extra commits on main: commit C' is the effect of squashing A-B-C together, and J' is the effect of squashing H-I-J together.

This is solvable. However, there's no tool included with Git itself that solves this problem. What we'll need is a program like git filter-repo (this will be easier to use, probably) or git filter-branch; we will need to write code. How much code and how complex it gets is up to you, but here's the basic task:

  • enumerate each labeled commit, so that we know which commits must be retained;
  • traverse the entire graph: for each labeled commit, retain it and (optionally) some depth of previous unlabeled commits, up to whatever limit you like;
  • retain (as a new root) any commit required to keep branch relationships intact, even if it would not be retained by the previous rules.

(Note: for the "enumerate labeled commits" step, git for-each-ref does the job. To handle the second bullet point, consider running git rev-list -n 5 on each hash ID from the first step. I'm not sure off-hand how to do the third step, but git merge-base --all might well be the ticket.)
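The three steps above could be sketched roughly as follows. This is only an illustration of the "which commits to retain" pass, not the rewriting itself; the function name list_retained_commits and its keep_depth argument are made up for this example, and using pairwise git merge-base --all for the third step is an assumption that may not cover every graph shape.

```shell
#!/bin/bash
# Sketch: print the hash ID of every commit to retain, one per line.
# list_retained_commits and keep_depth are hypothetical names for this example.

list_retained_commits() {
  local keep_depth="${1:-5}"
  local tips a b

  # Step 1: every commit labeled by a branch name.
  tips=$(git for-each-ref --format='%(objectname)' refs/heads | sort -u)

  {
    # Step 2: each tip plus up to keep_depth - 1 commits behind it.
    # (With merges, rev-list may walk more than one parent line.)
    for a in $tips; do
      git rev-list -n "$keep_depth" "$a"
    done

    # Step 3 (an assumption): the merge base(s) of every pair of tips,
    # so that related branches stay connected.
    for a in $tips; do
      for b in $tips; do
        [ "$a" = "$b" ] && continue
        git merge-base --all "$a" "$b" || true   # unrelated tips have none
      done
    done
  } | sort -u
}
```

Run inside the repository, list_retained_commits 5 prints the set of hash IDs that the later filtering pass would keep.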

That is, given a graph like this:

     C--...--H  <-- master
    /
A--B
    \
     I--...--Z  <-- br1

we must at least retain B because it's why H and Z are related at all. We need not retain C or I; we just want whatever commit we do retain behind H to connect back to B, and whatever commit we do retain behind Z to connect back to B as well, so that we get, e.g.:

  D'-E'-F'-G'-H'  <-- main
 /
B'
 \
  V'-W'-X'-Y'-Z'   <-- br1

Now that we know which commits to retain, traverse the entire graph again: copy commits marked "retain" and drop all the others, using the usual filter-branch / filter-repo rules.

(Along the way, we can consider updating author and/or committer name and/or email as well, if we like.)

We must make one more decision: if the input repository has both master and main, what should we do with the old main? You can simply forbid it (make this an error) unless and until it comes up.
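One way to apply the "make it an error" policy is sketched below. The function name rename_master_to_main is invented for this example; it only uses git show-ref --verify and git branch -m.

```shell
#!/bin/bash
# Sketch: rename master to main, refusing when both names already exist.
# rename_master_to_main is a hypothetical helper; run it inside the repository.

rename_master_to_main() {
  # Nothing to do if there is no branch named master.
  if ! git show-ref --verify --quiet refs/heads/master; then
    return 0
  fi

  # Forbid the ambiguous case rather than guessing what to do with old main.
  if git show-ref --verify --quiet refs/heads/main; then
    echo "error: both master and main exist; resolve this by hand" >&2
    return 1
  fi

  git branch -m master main
}
```

A non-zero return status lets a calling script (for example one running under set -e) stop before it pushes anything.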

You'll probably want to use git clone --mirror to make each input to be filtered with git filter-repo or whatever program you come up with here, and you'll probably want to use git push --mirror to push the output repository from git filter-repo.
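As a rough sketch of that round trip, assuming the destination repository already exists and is empty: the function name mirror_copy and its three arguments are placeholders for this example, and the rewriting step in the middle is deliberately left out.

```shell
#!/bin/bash
# Sketch: mirror-clone a source repository, then mirror-push the result.
# mirror_copy, src_url, dst_url and work_dir are hypothetical names.

mirror_copy() {
  local src_url="$1" dst_url="$2" work_dir="$3"

  # --mirror copies every ref (branches, tags, ...) into a bare repository.
  git clone --quiet --mirror "$src_url" "$work_dir" || return 1

  # ... run git filter-repo, or your own rewriting program, in "$work_dir" here ...

  # --mirror makes the destination's refs match work_dir's refs exactly,
  # including deleting refs that work_dir does not have.
  git -C "$work_dir" push --quiet --mirror "$dst_url"
}
```

Pushing to an explicit URL, rather than to a named remote, keeps the sketch independent of whatever remote configuration the rewriting step leaves behind.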

Writing the filtering code for git filter-repo (or writing your own program or coming up with some fancy script for git filter-branch) is going to be nontrivial.


Original answer to original question

The git clone command never copies any branches. Instead, it copies all branches.

This seems self-contradictory. How can not copying branches copy branches? It is true, and it's also self-contradictory. It occurs because the word branch doesn't have a meaning in Git.

More precisely, branch does not have one meaning. Branch has more than one meaning. So git clone copies one kind of branch, and does not copy another kind of branch. The number of meanings for the word branch depends in part on how you count, and in part on what you choose to count. See also What exactly do we mean by "branch"? (which discusses the correct question to think about).

Now, the real problem you're having appears to be the fact that—per this GitHub Desktop issue—GitHub Desktop simply does not support shallow clones in the first place. (Given that the issue was raised in 2017 and closed as "something we don't intend to support" in 2021, there seems to be no urgency here, so I doubt this situation has changed.) So while --no-single-branch is the correct method to defeat the default --single-branch action of --depth 5, the actual problem here is your use of --depth 5 at all, given that you're also using GitHub Desktop.

Your two options are therefore:

  1. Don't use GitHub Desktop.
  2. Do a full clone.
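If a shallow clone already exists, option 2 does not necessarily mean starting over. The sketch below checks whether the current repository is shallow and, if so, fetches the missing history; the function name ensure_full_clone is invented here, and the remote name origin is an assumption.

```shell
#!/bin/bash
# Sketch: turn an existing shallow clone into a full one.
# ensure_full_clone is a hypothetical helper; run it inside the clone.

ensure_full_clone() {
  # Prints "true" when the history was truncated, e.g. by --depth.
  if [ "$(git rev-parse --is-shallow-repository)" = "true" ]; then
    # Fetch the rest of the history from the remote named origin (assumed).
    git fetch --quiet --unshallow origin
  fi
}
```

Note that this restores the full history, so it helps only with the GitHub Desktop problem, not with the goal of publishing less history.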

Remember that when git clone makes a new clone, it:

  • changes all the other clone's branch names into remote-tracking names, and then
  • creates one new branch that's specific to your own repository.

You'll need to use git switch or git checkout or something to create more local branch names if you wish to have more local branch names (though your script suggests that you only really care about one branch name anyway).
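A minimal sketch of creating those extra local names, one per remote-tracking name, is shown below. The function name create_local_branches is made up for this example, and it assumes the remote is called origin unless told otherwise.

```shell
#!/bin/bash
# Sketch: create a local branch for every remote-tracking name.
# create_local_branches is a hypothetical helper; run it inside the clone.

create_local_branches() {
  local remote="${1:-origin}" ref name

  git for-each-ref --format='%(refname)' "refs/remotes/$remote" |
  while read -r ref; do
    name="${ref#refs/remotes/"$remote"/}"

    # Skip origin/HEAD, which is a symbolic pointer rather than a branch.
    [ "$name" = "HEAD" ] && continue

    # Skip names that already exist locally (git clone creates one).
    git show-ref --verify --quiet "refs/heads/$name" && continue

    # Create the local name with the remote-tracking name as its upstream.
    git branch --quiet --track "$name" "$remote/$name"
  done
}
```

After this runs, git branch lists one local name per branch in the other repository, each with its upstream set.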

torek
  • Thank you for the clarification. I care about all branches, but I am trying to rename any branch named "master" to "main" because all default branches in the destination repos are named "main". My entire goal here is to remove the commit history. If I was able to do that on the original repo then I would just clone without `--depth`. Is this possible? – stwhite Jul 31 '22 at 03:12
  • Ah: if you want to "delete (some) history" you *must* **rewrite** history. History always goes all the way back to the beginning: the *root* commit (or one or more root commits), which are commits that have no parents. A shallow clone is one that has one or more "graft points" at which Git can see that there are previous commit(s), but also that they are deliberately not included; such a repository has special restrictions. – torek Jul 31 '22 at 03:17
  • I would like to explicitly remove all the commit history for the repo while maintaining all branches. That is my sole objective here but as far as I could tell there wasn't a way to do that... – stwhite Jul 31 '22 at 03:18
  • You'll need to decide how much history you *do* want, because branches are just *names* that hold specific commit hash IDs. It's the *commits* that are the history, and that have relationships to each other, so if you want to have N>1 branches that are *related* they must specify final commits that in turn are related via their commit history. That means you'll want a `git filter-branch` or `git filter-repo` type of operation, or something klunkier but faster. Exactly what depends on what result you want. – torek Jul 31 '22 at 03:19
  • In short, it turns out you're starting with the wrong question. – torek Jul 31 '22 at 03:21
  • At this point I would settle for removing all history. I'm looking for a blanket solution because there are 8 different repos to do this to. – stwhite Jul 31 '22 at 03:22
  • I'm writing up a revised version of your question, which I will attempt to answer... – torek Jul 31 '22 at 03:26