Since nobody anywhere seems able to explain the objects involved in git, I'm unable to resolve my questions by myself (in contrast to other source code control systems).

I've started working in a cloned repository created by git clone repository (this is why I mention the "unnamed branch" below):

1) I performed changes

2) I did git add and git commit

3) I decided that I would like to back this up without disturbing my colleagues, so I did git branch SomeUniqueName followed by git checkout SomeUniqueName. Now I wonder what will happen if I do a git push. If the changes do not end up in the specified branch (SomeUniqueName) but in the unnamed branch, how can I change this? I already tried working with another cloned repository and copying my changes over. But in that case I don't know how to make the two repositories match the same starting point (in time): other developers might have changed the repository in the meantime, leaving my changes broken.

Somebody claims that this question is a duplicate of another question. I cannot relate to that other question, as I don't even know how to extract the list of changes from git. I suspect that this relates to the list of commits and that it can be done with "git log" (btw -- when looking at the man page of git-log I'm confused by most of the explained arguments). But when doing a "git log" I get commits made by other people who did not have access to my working directory. So somehow I have to remember (if I need my memory, what is the point of an SCCS running on a computer?) that all the changes made by me at the top are changes made only locally in my working directory. So I guess I'll try git reset --hard HEAD~8; git checkout SomeUniqueName AFTER I have made a backup of my working directory. And voila -- all my changes are gone. So I have to unpack the backup I just made and copy over my changes by hand.

  • Which branch did you commit in? The backup only added another pointer, unless you reset the previous branch to an earlier commit. – evolutionxbox Mar 30 '18 at 14:28
  • @evolutionxbox I specified the sequence of commands already. So the commits were performed in the unnamed branch. Will they end up in this unnamed branch? If so, how can I move these changes to the branch I want them in? –  Mar 30 '18 at 14:40
  • You need to start with a good tutorial or book on how Git represents commits in a Directed Acyclic Graph, how it uses branch names to store commit hash IDs, and how `git fetch` and `git push` deal with the branch names vs the commit hash IDs. Git's notions are very different from other VCSes. Are you familiar with graph theory? If so, start with http://eagain.net/articles/git-for-computer-scientists/ – torek Mar 30 '18 at 14:41
  • @torek I don't have a problem understanding graph theory. I just don't know which objects are modified by which git commands. Or even better: I don't know which objects are stored in a local git repository. –  Mar 30 '18 at 14:44
  • Can you be specific in what you mean by objects? – evolutionxbox Mar 30 '18 at 14:48
  • OK: so, if you've gone through the link above, you now know which objects are used to store each part. The ID of each object is its hash (SHA-1, at least for now). Normally, every repository stores *every* object required to keep the graph complete, and each external name points to some existing commit object. (Well, let me switch to answer rather than comment...) – torek Mar 30 '18 at 14:49
  • @torek -- so what -- changes are stored as a linked list of changes. I already guessed that changes somehow need to be stored in both the local repository and the remote one. This also does not tell me how to deduce which branch my changes will be pushed to. –  Mar 30 '18 at 14:52
  • Sounds like you haven't tried it yet? Whatever branch you make the commit on is the branch that will get the changes when you push to the remote. The most straightforward way, if you want those changes on another branch, is to revert the commit, create a new branch, make the changes on that branch, and commit them there. – rogerrw Mar 30 '18 at 14:54
  • Possible duplicate of [Move the most recent commit(s) to a new branch with Git](https://stackoverflow.com/questions/1628563/move-the-most-recent-commits-to-a-new-branch-with-git) – Stefan Crain Mar 30 '18 at 14:56
  • @evolutionxbox "The backup only added another pointer" -- you lost me. –  Mar 30 '18 at 14:57
  • A branch is just a named reference to an object. – evolutionxbox Mar 30 '18 at 15:00

2 Answers

OK, assuming you're good with graph theory and have read through Git for Computer Scientists:

  • All graph nodes are identified by hash ID. (The hashes are currently SHA-1s over the object contents prefixed by type-plus-size-plus-a-NUL-byte, not that this matters too much except for correctness.)
  • Except for so-called shallow clones, which we'll ignore here, every Git repository has every reachable commit and every reachable object underneath that commit.
  • Commits become reachable by having a branch name, or tag name, or any other externally-sourced name, from a second database of reference names. Branch names must point to commit objects. (Other names, especially tag names, can point to other object types, but again that's not important for our purposes here.) Branch names are the most interesting case here since that's how we'll build commits and transfer them from one repository to another—but there's a second kind of name, the remote-tracking name, that is also key.
  • Hence we can draw the commit graph like this for a simple linear case:

    A  <-B  <-C   <--master
    

    The external name master contains the hash ID of commit C. This commit object itself contains the hash ID of commit B; B contains the hash ID of A. A has no outgoing arcs (no parents as Git puts it) so it is a root commit and the action all stops here.

  • All objects are read-only at all times. Unreachable objects can be garbage-collected.
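As a rough illustration of the hash ID described in the first bullet, here is a minimal Python sketch that computes a blob's ID as the SHA-1 of a "type, space, size, NUL" header followed by the content. The function name git_object_id is made up for this example; the two sample inputs are the empty blob and the 12-byte text "hello world" plus a newline.

```python
import hashlib


def git_object_id(obj_type: str, content: bytes) -> str:
    """Return the SHA-1 ID for an object of the given type and content.

    The hashed bytes are: "<type> <size>" + NUL byte + content.
    (git_object_id is a name invented for this sketch.)
    """
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Two sample blobs: an empty file and a one-line text file.
empty_blob_id = git_object_id("blob", b"")
hello_blob_id = git_object_id("blob", b"hello world\n")

print(empty_blob_id)
print(hello_blob_id)
```

Because the ID depends only on the type and the bytes, the same file content gets the same ID in every repository, which is what lets two Gits compare notes by hash ID alone.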

Note that no internal object has the ID of C, so the fact that C is reachable, and hence the chain is retained, only occurs because of the external name refs/heads/master (branch master).
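To make the reachability idea concrete, here is a small Python model, not Git's actual implementation. Commits are single letters, the parents dictionary holds the backwards arcs, and X is a hypothetical extra commit that no name points to.

```python
def reachable(parents, refs):
    """Return every commit reachable from any reference name."""
    seen = set()
    stack = list(refs.values())
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents[commit])  # follow the backwards arcs
    return seen


# A <- B <- C, plus a hypothetical stray commit X whose parent is B.
parents = {"A": [], "B": ["A"], "C": ["B"], "X": ["B"]}
refs = {"refs/heads/master": "C"}

live = reachable(parents, refs)
garbage = set(parents) - live  # candidates for garbage collection

print(sorted(live), sorted(garbage))
```

In this toy model, deleting the refs/heads/master entry would leave nothing reachable at all, which is why the names matter so much.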

If we add a new branch name, such as dev, we get:

A--B--C   <-- master, dev

(the internal arrows in the graph are all still backwards because of the read-only nature of the commit objects, but this gets too painful to draw in text, so I don't bother). Now Git needs a way to know which name to adjust when making new commits, so it attaches the label HEAD to some branch name. HEAD can only be attached to a branch name! Let's draw that in:

A--B--C   <-- master, dev (HEAD)

To make a new commit, Git packages up all the blob hashes stored in the index (which we've skipped over here) into a new tree object and then creates a new commit object pointing to the tree, as discussed in the "Git for CSists" link above. The new commit will point back to whichever commit HEAD indirectly points to:

A--B--C
       \
        D

and then Git will just overwrite whichever name HEAD is attached to, so that the names are:

A--B--C   <-- master
       \
        D   <-- dev (HEAD)
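The same sequence can be sketched as a toy Python model. The Repo class and its fields are invented for illustration and ignore trees, blobs and the index entirely; the point is only which name moves when a commit is made.

```python
class Repo:
    """Toy model: a commit graph, branch names, and an attached HEAD."""

    def __init__(self):
        self.parents = {}   # commit ID -> list of parent commit IDs
        self.branches = {}  # branch name -> commit ID
        self.head = None    # name of the branch HEAD is attached to

    def commit(self, new_id):
        tip = self.branches.get(self.head)
        # The new commit points back to the current tip (if there is one)...
        self.parents[new_id] = [tip] if tip is not None else []
        # ...and only the branch HEAD is attached to gets overwritten.
        self.branches[self.head] = new_id


repo = Repo()
repo.head = "master"
for commit_id in "ABC":
    repo.commit(commit_id)

repo.branches["dev"] = repo.branches["master"]  # roughly: git branch dev
repo.head = "dev"                               # roughly: git checkout dev
repo.commit("D")

print(repo.branches)
```

Note that commits A, B and C were made while HEAD was attached to master, yet they are reachable from dev as well: creating the new name copied nothing, it only added another pointer.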

Push and fetch transfer objects, then set names

We now have almost everything we need to understand both git push and git fetch. Let's look at push first since you're more concerned with it.

Your Git will call up some other Git and hand over a hash ID, such as that for your new commit D. The other Git has no name for this ID yet; it just checks to see whether it has the hash ID. If not, it needs the commit (and probably the tree and blob objects as well), so it says "send me those objects". Your Git packs them up and sends them over. They put the commit D and its sub-objects into their graph, but as yet they have no name for this object.

Now your Git sends a name, such as refs/heads/dev. Their Git now looks to see if they can set this name. There are two cases:

  • They don't have a refs/heads/dev branch: it's pretty safe to just create it, so they probably will. (You can set up fancy rules on the receiving side about what to allow or refuse, hence we can only say "probably" here.)

  • They do have a refs/heads/dev: they'll check to see if changing it from whatever hash it has now, to point to commit D, will keep all reachable commit objects still reachable. That's easy to do: is the commit to which their dev points now an ancestor of D? If so, the push is OK. If not, the push gets rejected as a non-fast-forward.
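The two cases can be sketched in Python as follows. This is a simplified model of the receiving side under the assumption that no extra server-side rules are configured; is_ancestor and receive_push are names made up for the example, and E is a hypothetical commit that branches off from B.

```python
def is_ancestor(parents, ancestor, descendant):
    """True if 'ancestor' can be reached by walking back from 'descendant'."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False


def receive_push(parents, refs, name, new_id):
    """Create the name, or move it only if the move is a fast-forward."""
    old_id = refs.get(name)
    if old_id is None or is_ancestor(parents, old_id, new_id):
        refs[name] = new_id
        return "ok"
    return "rejected (non-fast-forward)"


parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["B"]}
their_refs = {"refs/heads/master": "C"}

first = receive_push(parents, their_refs, "refs/heads/dev", "D")     # new name
second = receive_push(parents, their_refs, "refs/heads/dev", "E")    # would lose D
third = receive_push(parents, their_refs, "refs/heads/master", "D")  # C is behind D

print(first, second, third, their_refs)
```

A forced push corresponds to skipping the is_ancestor test, which is exactly why it can make commits on the other side unreachable.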

Using git fetch is almost, but not quite, symmetric. When you git fetch from some other repository, your Git has their Git list all their branch names and hash IDs, by default. Your Git then asks for all the commits they have that you don't, along with any history behind those commits that you are also missing. At the end, though, instead of creating or adjusting local branch names, your Git sets up remote-tracking names such as origin/master and origin/dev.

(Technically these are refs/remotes/origin/master, in a separate name space from branch names, so that there's no chance of collision. In practice, as long as you don't name your own branches origin/whatever that's not a problem anyway.)

The last bit that's the most confusing initially

If your repository is initially created by cloning (which internally does a git fetch) some existing repository, you start out with:

...--o--...--o   <-- origin/master
      \
       o--...--o   <-- origin/dev

and the like. These remote-tracking names make all the commits reachable, so that they're not all garbage collected. But then where does your local branch master come from?

The trick is this: git checkout will create a new local branch name by reversing the renaming that git fetch did with the other Git's branch names. We know that our origin/master must have been their master, so git checkout master, when we don't have a master, will search for origin/master. If that exists, Git just creates a new label, in the branch name space, giving us:

...--o--...--o   <-- master (HEAD), origin/master
      \
       o--...--o   <-- origin/dev
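A minimal Python sketch of that lookup, assuming a single remote named origin (real Git has more rules than this, for example when several remotes carry the same branch name). The checkout function and the short IDs c3 and d7 are stand-ins invented for the example.

```python
def checkout(refs, name, remote="origin"):
    """Return the branch name HEAD would be attached to after checkout.

    If the local branch is missing, create it from the matching
    remote-tracking name, as described above.
    """
    local = f"refs/heads/{name}"
    if local not in refs:
        tracking = f"refs/remotes/{remote}/{name}"
        if tracking not in refs:
            raise ValueError(f"no branch or remote-tracking name for {name!r}")
        refs[local] = refs[tracking]  # new label, same commit
    return local


# State right after cloning: only remote-tracking names exist.
refs = {
    "refs/remotes/origin/master": "c3",  # stand-in hash IDs
    "refs/remotes/origin/dev": "d7",
}
head = checkout(refs, "master")

print(head, refs)
```

No local dev is created here: a local name only appears once you ask for it.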

To create your own new labels, use git branch <name> <hash-id-or-other-commit-specifier>, which creates the refs/heads/<name> name pointing to the given commit. You can also use git checkout -b <name> <specifier>, which does the name creating and immediately does a git checkout of the new name to attach HEAD to it.

Cleanup

There's one more important bit to know, often not covered right away because this graph stuff overwhelms everyone. This is refspec syntax. Both git push and git fetch use refspecs, although they treat them differently.

A refspec is mostly just a pair of reference names separated by a colon. The name on the left is the source name and the name on the right is the destination name. There's also an optional leading plus sign, which means "force": update the name even if the update is not a fast-forward.

When you use git fetch origin, the default refspec is:

+refs/heads/*:refs/remotes/origin/*

This means that your Git matches the other Git's branch names (refs/heads/*), but turns all those names into your own remote-tracking names (under refs/remotes/ and furthermore under origin/—this leaves room for additional remotes).
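The name rewriting can be sketched in Python like this. It is a simplified model that handles one * on each side and nothing else; map_ref is a name invented for the example, not a Git function.

```python
def map_ref(refspec, name):
    """Map a source name through a refspec.

    Returns (destination_name, force), or None if the name does not
    match the source side of the refspec.
    """
    force = refspec.startswith("+")
    if force:
        refspec = refspec[1:]
    src, dst = refspec.split(":", 1)
    if "*" not in src:
        return (dst, force) if name == src else None
    prefix, suffix = src.split("*", 1)
    if len(name) < len(prefix) + len(suffix):
        return None
    if not (name.startswith(prefix) and name.endswith(suffix)):
        return None
    matched = name[len(prefix):len(name) - len(suffix)]
    return (dst.replace("*", matched, 1), force)


default_fetch = "+refs/heads/*:refs/remotes/origin/*"

print(map_ref(default_fetch, "refs/heads/master"))
print(map_ref(default_fetch, "refs/heads/feature/x"))
print(map_ref(default_fetch, "refs/tags/v1"))
```

Tags fall outside refs/heads/*, so this refspec leaves them alone; the leading plus sign is why remote-tracking names follow the other Git even when its branches are rewound.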

If you omit the destination name in a git fetch command, Git doesn't write any names. This leaves the fetched commits subject to garbage collection, but there's a delay, because Git first does record the hash IDs into .git/FETCH_HEAD, where you can retrieve them and where they act as temporary retainers. The FETCH_HEAD contents get overwritten by the next fetch, so they are not as good as real names in the name-to-ID database.

For git push, however, the default is generally to push a branch name to the same name on the other Git. That is, git push origin master really means git push origin master:master (and Git fills in the refs/heads/ part when it discovers that master is a branch name). For more on how Git looks up these names, see the gitrevisions documentation.
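As a rough Python sketch of that expansion, assuming the argument names one of your local branches (Git's real lookup rules, covered in the documentation mentioned above, also handle tags and other names). expand_push_arg is a made-up helper, and SomeUniqueName is the branch from the question.

```python
def expand_push_arg(arg, local_branches):
    """Expand 'master' or 'src:dst' into a fully qualified refspec."""
    src, _, dst = arg.partition(":")
    if not dst:
        dst = src  # no destination given: push to the same name

    def qualify(name):
        # Already a full reference name? Leave it alone.
        return name if name.startswith("refs/") else "refs/heads/" + name

    if not src.startswith("refs/") and src not in local_branches:
        raise ValueError(f"{src!r} is not a local branch in this sketch")
    return f"{qualify(src)}:{qualify(dst)}"


branches = {"master", "SomeUniqueName"}

print(expand_push_arg("master", branches))
print(expand_push_arg("SomeUniqueName", branches))
print(expand_push_arg("SomeUniqueName:backup", branches))
```

Under this model, git push origin SomeUniqueName asks the other Git to set its own refs/heads/SomeUniqueName and leaves its master untouched.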

torek

When you type git push with no other arguments, the changes will go to the specific remote branch that your local branch tracks. If there is no remote tracking branch, then nothing will happen and git will complain to this effect.

Remote tracking is just another way of saying "this branch will be kept in sync with remote branch origin/xzy". Git will automatically set up this synchronisation when you check out a remote branch using the form git checkout remotebranchname. However, if you just create a local branch (as you did) git won't automatically set any remote tracking branch.

So with that context, let's consider your example. You have checked out a remote branch using git checkout UnnamedBranch, which git automatically sets to track origin/UnnamedBranch. Then you used git branch SomeUniqueName to create a new branch off UnnamedBranch. Git will not automatically set this branch to track any remote branch.

That then answers your question. At this point when you git push from SomeUniqueName, nothing will happen, and git will complain to you.

Now, if you had for some reason managed to push changes to origin/UnnamedBranch from your local SomeUniqueName (which is possible, see git push for other forms), then you can revert that.

git checkout UnnamedBranch
git push -f

This will reset the remote branch to match your local version of UnnamedBranch. Keep in mind this will affect the branch history of anyone who pulled updates from UnnamedBranch in between your mistake and you fixing it. Only do this carefully and in consultation with your colleagues.

Cody Hamilton