
I have somehow managed to create a duplicate of every commit I have made to a repository in an unnamed branch, as shown below on GitHub's network graph:

[Image: GitHub network graph of the repository]

What can I do to clean this up and remove the duplicated commits? I have looked at this post here, but it seems to apply more to individual duplicates than to an entire duplicated branch.

Additional information: I am the sole owner and user of the repository. The issue first arose when I made a commit from a new computer, but did not properly set my username and email address. I rectified this commit using the methods posted here on GitHub. After doing this, I was prompted to pull from the remote repository to my local repository before pushing some additional changes I had made, at which point it pulled the set of duplicate commits.

hfisch
  • Did you look at this post? [link](http://stackoverflow.com/questions/2003505/how-to-delete-a-git-branch-both-locally-and-remotely) – Murat Gündeş Feb 26 '17 at 19:24
  • @MuratGündeş I'm not sure how to implement that given that right now the only branch that exists, remote or local, is the `master` branch. – hfisch Feb 27 '17 at 00:29

2 Answers


If by unnamed branch you mean dangling commits, then `git gc --prune=all` should do the trick. Before running this command, make sure you don't need any of those commits, because pruning is irreversible.

More on git gc

Then you can execute `git push --prune` to remove remote branches that have no local counterpart.
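As a minimal sketch of the "check before you prune" advice, the script below builds a throwaway repository, leaves one commit dangling by amending it, lists it with `git fsck`, and only then prunes. The temporary directory, file name, and identity are made up for illustration. It uses `--prune=now`, and it expires the reflog first, on the assumption that the reflog is what keeps the old commit alive.

```shell
set -e
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo one > f.txt
git add f.txt
git commit -q -m "first"
old=$(git rev-parse HEAD)

# Amending replaces the commit; the old one is now reachable only via the reflog.
git commit -q --amend -m "first (amended)"

# List commits that no branch or tag reaches, ignoring the reflog.
before=$(git fsck --unreachable --no-reflogs 2>/dev/null | grep -c '^unreachable commit' || true)
echo "unreachable commits before pruning: $before"

# Irreversible from here on: drop reflog entries, then prune unreachable objects.
git reflog expire --expire=now --all
git gc -q --prune=now

after=$(git fsck --unreachable --no-reflogs 2>/dev/null | grep -c '^unreachable commit' || true)
echo "unreachable commits after pruning: $after"
```

If the `git fsck` listing shows anything you still want, give it a name first (for example with `git branch <name> <hash>`) so that pruning leaves it alone.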

mshrbkv

In one sense, Git does not have unnamed branches at all. In another sense, it has an infinite number of unnamed branches. Either way it makes little sense to attempt to display all of them,¹ and a GitHub network graph doesn't. That's not what a GitHub network graph is.

This is what a GitHub network graph is. It attempts (not entirely successfully, in my opinion :-) ) to display the union of multiple related (by GitHub-fork, I presume) repositories. Thus, what you are seeing is not necessarily in your own repository: it may span several different repositories, perhaps including some of your own, or it may live entirely in other repositories. There is not enough information in your existing question to diagnose this.

Since Git largely works by adding new commits (while keeping every old commit that is still reachable through what Git calls references), most things you do will simply add more commits, and perhaps even more lines, to your network graph. ElpieKay's answer to your linked question works by copying commits, which would (and does) add another line. It makes sure that the first copied commit links back to some existing commit (that's the --onto target of the rebase), and then removes the name that Git used to find the original chain of commits (that's the final step of git rebase itself).

In other words, this (at least conceptually—ElpieKay's answer is specific to one particular situation, ending in a merge commit, that I'm not drawing here) takes:

o--o--o--o   <-- main-branch

*--*--*   <-- side-branch

and first adds a copy of the bottom row, but with a link back to the top:

o--o--o--o   <-- main-branch
          \
*--*--*    *--*--*   <-- HEAD

(I omitted the side-branch label partly to save space, and partly because it sits "under" the copied commits: it still points to the last of the original three commits, though.) Once the copying is done, we "forget" the original three by changing just the one branch label, giving:

o--o--o--o   <-- main-branch
          \
           *--*--*   <-- side-branch

The three original commits are forgotten precisely because they no longer have a name, side-branch, pointing to their tip (latest) commit.
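The three drawings above can be reproduced in a scratch repository. This is only a sketch under assumptions: the branch names match the drawings, the file names and identity are invented, and I use `--root` because the drawn side-branch chain has no parent at all (your real history may need a `<limit>` commit instead).

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/main-branch

# o--o--o--o   <-- main-branch
for n in 1 2 3 4; do
  echo "$n" > "main$n.txt"
  git add "main$n.txt"
  git commit -q -m "main $n"
done

# *--*--*   <-- side-branch (an unrelated chain with its own root)
git checkout -q --orphan side-branch
git rm -rfq .
for n in 1 2 3; do
  echo "$n" > "side$n.txt"
  git add "side$n.txt"
  git commit -q -m "side $n"
done
old_tip=$(git rev-parse side-branch)

# Copy the three commits after main-branch's tip, then move the side-branch label.
git rebase -q --onto main-branch --root side-branch

new_tip=$(git rev-parse side-branch)
copied=$(git rev-list --count main-branch..side-branch)
git log --all --decorate --oneline --graph
```

After the rebase, `old_tip` no longer has a branch name pointing at it, which is exactly the "forgotten" state in the last drawing; the copies are new commits with new hashes.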

One can accomplish this kind of thing with two different Git commands, which have different aims:

  • git rebase copies some (usually small and simple) set of commits, preferably a simple linear chain, to a new set of commits, also changing the stored snapshot for each copied commit. This is essentially equivalent to running git cherry-pick on each to-be-copied commit in turn. Then, as we just saw, it moves the branch label so that instead of pointing to the last of the pre-copy commits, it points to the last of the post-copy commits.

  • git filter-branch copies an arbitrary set of commits, including merges if you like, to a new set. During this copying, each extracted commit snapshot is modified according to / using arbitrary filtering of your choice. This may include omitting some commits, creating new commits, changing the parent linkages of commits, and/or making arbitrary changes to the snapshot. This is therefore much more flexible than rebase, but also much much more difficult to use (and not well suited to the kind of source-tree modification rebase is specifically designed for, either).
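To make the "copies commits" point concrete for filter-branch, here is a hedged sketch that rewrites an author and committer email in a scratch repository and shows that every rewritten commit gets a new hash. The email addresses and file names are invented, and the env-filter is my own minimal version, not necessarily the script you ran. `FILTER_BRANCH_SQUELCH_WARNING` just silences the warning newer Git versions print.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example"
git config user.email "wrong@example.com"   # the "bad" identity (hypothetical)

for n in 1 2 3; do
  echo "$n" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "commit $n"
done
before=$(git rev-list HEAD)

export FILTER_BRANCH_SQUELCH_WARNING=1
# Copy every commit, swapping the email while each copy is being made.
git filter-branch -f --env-filter '
if [ "$GIT_AUTHOR_EMAIL" = "wrong@example.com" ]; then
  export GIT_AUTHOR_EMAIL="right@example.com"
fi
if [ "$GIT_COMMITTER_EMAIL" = "wrong@example.com" ]; then
  export GIT_COMMITTER_EMAIL="right@example.com"
fi
' -- --all >/dev/null 2>&1

after=$(git rev-list HEAD)
emails=$(git log --format=%ae | sort -u)
backups=$(git for-each-ref refs/original)

echo "old hashes:"; echo "$before"
echo "new hashes:"; echo "$after"
```

The originals are still in the repository afterwards (filter-branch keeps a backup name under `refs/original/`), so there are now two parallel chains with the same content but different hashes.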

Whether any of this will help depends on what GitHub is showing you. Because a network graph is not limited to your repository, there is no guarantee that doing anything within your repository will improve anything you see.


¹ Every commit can be treated as an anonymous branch, or perhaps even an infinite number of anonymous branches (though the latter is not useful, while the former sometimes is).

One time when it does make sense to show an anonymous branch is when looking at a repository with a "detached HEAD". This is because a detached HEAD literally means that the name HEAD resolves to one specific commit, making that one specific commit act like a branch-tip commit. In effect, HEAD is the name of this branch. But HEAD is supposed to move, or be "re-attached". (Your HEAD is there to move you around.) As soon as you move it, the commit stops being the tip of that anonymous branch.

In the steps shown above for rebase, I show HEAD pointing to the copied tip commit. That is in fact how Git does the rebase, using HEAD as the tip of an anonymous branch. The last step, moving the original branch's name, cements the copies into place permanently (permanent, that is, until you rebase again, or delete the name entirely).
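A small sketch of the detached-HEAD case, again in a scratch repository with invented names: while HEAD is detached, new commits land on a branch that has no name, and giving that tip a name (here the hypothetical `rescued`) is what makes it stick.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo 1 > a.txt
git add a.txt
git commit -q -m "first"

# Detach HEAD: it now names a commit directly rather than a branch.
git checkout -q --detach
echo 2 > b.txt
git add b.txt
git commit -q -m "on the anonymous branch"

# symbolic-ref fails quietly when HEAD is detached, so this stays empty.
detached_name=$(git symbolic-ref -q --short HEAD || true)
tip=$(git rev-parse HEAD)

# Attach a name to the tip so it survives HEAD moving elsewhere.
git checkout -q -b rescued
attached_name=$(git symbolic-ref -q --short HEAD)
echo "detached: '$detached_name', attached: '$attached_name'"
```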

torek
  • If it makes things clearer, I've posted some additional information on the repository and how it got to the point that it's at right now. – hfisch Feb 27 '17 at 00:04
  • Ah, that definitely helps. When you ran `git filter-branch` you copied *every* commit, altering them during the copy (in that period between extracting the original and making the new ones). Then you ran `git pull`, and `git pull` is always a bad idea. :-) In Git, pull means "fetch then merge": so you picked up all your original commits (well, you still had them, just "virtually" picked up) and then *combined* your new copies with your originals. You must now decide which set of commits to keep, and which get their name(s) deleted, *and* you must discard the merge and copy subsequent commits. – torek Feb 27 '17 at 00:58
  • The decision on which to keep is going to be based on which have the right author/committer names. Discarding merges while keeping subsequent commits is usually easiest using `git rebase`. The precise command(s) to use depends on your repository. – torek Feb 27 '17 at 00:59
  • Alright. It's easy enough to see which set to keep and which to remove; I'm trying to use `git rebase -i`, but I get a merge conflict with every change, whether it's a squash or drop. You say that the precise commands depend on my repository; how can I determine that? – hfisch Feb 27 '17 at 04:25
  • Use `git log --graph` to get Git to draw the graph for you (I like to use, as one put it, A DOG: `git log --all --decorate --oneline --graph`, here). Note that `git rebase -i --onto <target> <limit>` will copy commits from `<limit>` through `HEAD`, excluding any merges, to go after the named `<target>`. – torek Feb 27 '17 at 06:57
  • Alright... I've set my `<target>` as the first commit in the set of commits that I want to overwrite, and the `<limit>` as the first commit in the set of commits that I want to keep, with both `<target>` and `<limit>` being duplicate commits of each other. When I try to rebase this, I get a very long list of working tree files that would be overwritten by checkout, and the statement `Please move or remove them before you switch branches.` The rebase then aborts. The list appears to include every single file in the repository; what am I doing wrong now? – hfisch Feb 28 '17 at 01:16
  • I'm not sure off-hand exactly what's wrong, but this part certainly *sounds* wrong: "I've set my target as the first commit in the set of commits that I want to overwrite". Rebase doesn't *overwrite* anything, it copies; the first copied commit has the `--onto` target as its parent, which means that this target commit is the commit that comes just before the new chain. It's a good idea to draw the graph, identify the commits you want to copy, and then do `git log <limit>..HEAD` to make sure those match up. Then the `--onto` is simply the last (most tip-ward) commit to *keep*. – torek Feb 28 '17 at 01:30
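A sketch of the target/limit advice in the comments above, on a made-up history: three "original" commits with the wrong email, three duplicates with the right one on a separate chain, and two newer commits that sit on top of the originals. The branch names `work` and `keep`, the emails, and the file names are all hypothetical, and a real repository that also contains a merge will look messier than this.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example"
git config user.email "wrong@example.com"
git symbolic-ref HEAD refs/heads/work

# Original chain (the set to abandon).
for n in 1 2 3; do
  echo "$n" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "commit $n"
done
limit=$(git rev-parse HEAD)       # last commit NOT to copy

# Duplicate chain with the corrected email (the set to keep).
git checkout -q --orphan keep
git rm -rfq .
for n in 1 2 3; do
  echo "$n" > "file$n.txt"
  git add "file$n.txt"
  git -c user.email="right@example.com" commit -q -m "commit $n"
done
target=$(git rev-parse HEAD)      # last (most tip-ward) commit to keep

# Newer work that was committed on top of the originals.
git checkout -q work
for n in 1 2; do
  echo "$n" > "new$n.txt"
  git add "new$n.txt"
  git commit -q -m "new work $n"
done

# Preview exactly which commits the rebase will copy.
git log --oneline "$limit"..HEAD
to_copy=$(git rev-list --count "$limit"..HEAD)

# Copy them so that the first copy has $target as its parent.
git rebase -q --onto "$target" "$limit"
git log --all --decorate --oneline --graph
```

The preview step is the important part: if `git log <limit>..HEAD` lists commits you did not expect, the limit is wrong and the rebase will try to copy too much.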