
I have a branch named task. I checked out a commit I made one day ago, like this:

git checkout 53ba56611358e90d4990c3a8642e46c7bc93e514

Then I wrote some code and committed it like this:

git add .
git commit -m "message"

Then I switched branch with:

git checkout dev

and now I don't know how to get back to that branch. When I switch to the branch task, HEAD is not my last commit; it is the same commit as before I ran:

git checkout 53ba56611358e90d4990c3a8642e46c7bc93e514
AnorLondo
  • You have to `--amend` the commit. You were not on a branch - you were in detached state – fredrik Aug 17 '21 at 14:03
  • _"back to that branch"_ - you were not on a branch. You committed the code in a detached head state. Use `git reflog` to find the commit ID of the new commit you made – evolutionxbox Aug 17 '21 at 14:03
  • You made a commit while in detached head mode. – matt Aug 17 '21 at 14:04
  • [Why did my Git repo enter a detached HEAD state?](https://stackoverflow.com/questions/3965676/why-did-my-git-repo-enter-a-detached-head-state) – Joachim Sauer Aug 17 '21 at 14:04
  • **Why** did you do `git checkout 53ba56611358e90d4990c3a8642e46c7bc93e514` instead of `git checkout mybranch`? – Joachim Sauer Aug 17 '21 at 14:04
  • In future after checking out a commit, create a branch `git switch -c my-branch` – evolutionxbox Aug 17 '21 at 14:06
  • Does it mean that I lost the code I wrote after I ran `git checkout 53ba56611358e90d4990c3a8642e46c7bc93e514`? – AnorLondo Aug 17 '21 at 14:08
  • @AnorLondo no it's not lost. You've committed the code. Use `git reflog` to find the commit ID of the unknown commit and then you can check it out, then create a branch – evolutionxbox Aug 17 '21 at 14:09
  • @AnorLondo: it's "lost" in the sense that it's not tracked by a branch. But you can fix it with `reflog`. To avoid it, read the question about detached HEAD: it tells you how you got into this mess and what to do to avoid it in the future. – Joachim Sauer Aug 17 '21 at 14:38
  • "I checked out to the commit i did one day ago like this:" `git checkout 53ba56611358e90d4990c3a8642e46c7bc93e514` So the short answer is, that was a bad thing to do. You probably meant `git reset --hard 53ba56` though it's a bit difficult to tell. – matt Aug 17 '21 at 16:54

1 Answer


You will need to dig around in the reflog for HEAD, find the "missing" commit's hash ID, and save that hash ID somewhere (probably in a new branch name, if only temporarily). That is:

$ git reflog
<inspect the output>
$ git show <hash>      # for some hash ID that looks likely
# if that's the right hash:
$ git branch temp-save-branch <hash>

Each of the things in <angle brackets> represents something you wouldn't type literally at your command prompt, and each of the things after the pound sign # is a comment (which you wouldn't type in at all).

To understand this, read on.

Git is all about commits

Those new to Git, as you probably are, often think that Git is about files. It's not—though commits do store files (but not folders, just files!). Or, they might think Git is about branches. It's not about those either, though commits are organized into branches, and branch names help us—and Git—find the commits. This means that it's crucial for you to understand what a Git commit is, and what it does for you.

What it is, is a numbered thing (of type commit internally, as there are actually four internal types, not that you'll normally use anything but commits and—sometimes—tags). The numbers aren't simple counting numbers though: we don't have commit #1, followed by #2, then #3, and so on. Instead, each number is very large, and quite random looking, and usually expressed in hexadecimal, such as 53ba56611358e90d4990c3a8642e46c7bc93e514.

What a commit does for you is to store two things:

  • First, each commit stores a full snapshot of all of your files. These files are in a special, read-only, Git-only, compressed and de-duplicated form. The de-duplication takes care of the first objection most people make to the fact that every commit stores every file, again and again: if you make a commit that has a thousand files, then change one file and make a new commit, the new commit re-uses 999 files. It only really has to store one new file. Both commits still store all 1000 files, it's just that 999 of them are shared.

    Because these files are read-only, Git-only, compressed, and de-duplicated, you can't actually use them. That's why git checkout does what it does (which we'll see in a moment). They can be shared because they're read-only: they literally can't change. In fact, no part of any commit can ever change, including ...

  • The other part of each commit is the metadata, or information about this particular commit itself. The metadata is where Git stores the name and email address of the person who made the commit, for instance. Git keeps two name + email + date-time-stamp triples per commit: one for the author, and one for the committer, even though both are usually the same. Git also keeps in here some of its other internal maintenance items, and any commit log message the author/committer chooses to include, to describe why they made that particular commit.

    Critical for Git's own operation, each commit stores the hash ID (or IDs) of a previous commit (or some number of previous commits). Most commits store exactly one such hash ID, and that's all we'll look at here. We—and Git—call this stored hash ID the parent of the commit. That makes this commit the child.
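To see both parts of a commit for yourself, here is a minimal sketch in a throwaway repository. The file names, the identity, and the commit messages are made up for illustration; `git cat-file -p` and `git rev-parse` are ordinary Git commands.

```shell
# Throwaway repository; the file names and identity below are made up.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one > a.txt
echo two > b.txt
git add . && git commit -q -m "first"

echo changed > b.txt              # only b.txt changes
git add . && git commit -q -m "second"

# The raw commit object: a tree line (the snapshot), a parent line,
# author and committer lines, then the log message.
git cat-file -p HEAD

# a.txt is unchanged, so both snapshots refer to the same stored object.
blob_old=$(git rev-parse HEAD~1:a.txt)
blob_new=$(git rev-parse HEAD:a.txt)
echo "a.txt before: $blob_old"
echo "a.txt after:  $blob_new"

# The parent line holds the hash ID of the previous commit.
parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')
echo "parent: $parent"
```

The two `a.txt` lines should print the same hash ID, which is the de-duplication at work: the second commit re-uses the stored file rather than saving a second copy.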

Note that the child knows its parent, but the parent doesn't know its children. This falls out of the fact that all commits are—necessarily, due to Git's commit numbering system—completely read-only. When we make a child commit, its parent already exists, so the child can store that hash ID. But when we make the child, we don't know whether it will have any future children, much less what their hash IDs will be. The hash ID of some future commit depends not only on what goes into that commit, but also the exact date and time that we (or whoever) make it. If we try to predict a future commit's hash ID, we have to get the time right, down to the second (and, due to some fancy cryptography math, even that's not enough).

What all of this means is that Git works backwards. Each commit remembers its parent, which means that commits form backwards-looking chains. If the last commit in some chain has some hash ID H, we might draw that commit like this:

            <-H

The little backwards-pointing arrow coming out of H points to its parent, which has some other random-looking hash ID; let's call that one G and draw it in:

        <-G <-H

Of course, G likewise points backwards to its parent. Let's call that F and draw it in:

... <-F <-G <-H

This goes on all the way back in time to the very first commit, which—being the first commit—doesn't have a parent, so it just doesn't bother to point backwards. The whole chain, then, is:

A--B--C--D--E--F--G--H

where we get lazy and stop drawing the arrows as arrows (mostly because it's too hard in a text-only StackOverflow answer).

The nice thing about this is that, given the hash ID of the last commit in the chain—H in this case—we can have Git itself find all the earlier commits. Git uses the hash ID that we (somehow) give it to locate H. H itself locates G, which locates F, and so on, all the way back to A. A git log stops after showing commit A since there's nothing left to show, although we typically will quit out of a git log long before it gets all the way back to the start anyway, in a real repository with thousands of commits.
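As a rough illustration of this backwards walk, the sketch below builds a three-commit chain in a scratch repository and lets Git follow the parent links from the last commit. The commit messages and file name are hypothetical.

```shell
# Throwaway repository with a three-commit chain A--B--C.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

for name in A B C; do
  echo "$name" > file.txt
  git add file.txt
  git commit -q -m "commit $name"
done

# Given only the last commit, Git finds every earlier one by
# following each commit's stored parent hash ID.
git log --format='%h %s'

# Count the commits reachable from HEAD, and find the one with no parent.
count=$(git rev-list --count HEAD)
first=$(git rev-list --max-parents=0 HEAD)
echo "reachable commits: $count"
echo "root commit: $first"
```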

But the thing is, Git really needs the hash ID of H, to find it fast. It's possible, in a small enough repository, to go through every commit and figure out which one(s) are "at the end", but that's hard and slow: in a big repository with hundreds of thousands of commits, it would take seconds, or maybe even minutes. We like our answers in nanoseconds, or maybe milliseconds at worst. So we need to save H's hash ID somewhere. We could write it down on paper, or scribble it onto a whiteboard at work—but that's just silly: we have a computer! Let's have the computer save H's hash ID somewhere.

Enter branch names

A branch name, in Git, is just a way for us to have the computer save the last hash ID. That is, if H is an interesting commit, because it's the last commit on—say—main, we tell Git to scribble H's hash ID into a table somewhere under the name main. We draw that like this:

...--G--H   <-- main

If we'd like a new branch, such as dev, now, we can just create a second name, and have that name store H's hash ID too:

...--G--H   <-- dev, main

That might seem a little silly at first, but read on.
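A quick way to convince yourself that a branch name is just a stored hash ID is to create a second name and compare, as in this sketch. It assumes a scratch repository; `git branch -M main` is used only so the first branch is called main regardless of your Git's default.

```shell
# Throwaway repository with a single commit.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo hello > file.txt
git add file.txt && git commit -q -m "H"
git branch -M main        # make sure the branch is called main

# Creating a branch writes no new commit; it only records a name.
git branch dev

# Both names resolve to the same commit hash ID.
git rev-parse main dev

# List every branch name alongside the hash ID it stores.
git for-each-ref --format='%(refname:short) %(objectname)' refs/heads
```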

HEAD determines the current branch

Once we have several branch names, we need a way to have Git know which name we are using. To let Git know, we'll "attach" the special name HEAD to exactly one branch name, like this:

...--G--H   <-- dev, main (HEAD)

We are now on branch main, as git status would say. Or, if we run:

git checkout dev

then we are on branch dev, because the picture has changed:

...--G--H   <-- dev (HEAD), main

Either way, git checkout will extract commit H's files, for us to see and work on/with. The files in commit H are, after all, read-only, and usable only by Git itself. They're not in the normal folder-with-file form.[1] So git checkout has to copy the files out of the commit. We won't go into any detail here, but this itself gets a bit complicated.
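To check which branch name HEAD is attached to, one option is `git symbolic-ref`, as in this sketch. It assumes a scratch repository with two names, main and dev, both pointing to the same commit.

```shell
# Throwaway repository: one commit, two branch names.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo hello > file.txt
git add file.txt && git commit -q -m "H"
git branch -M main
git branch dev

# HEAD is attached to main.
git symbolic-ref --short HEAD

# Re-attach HEAD to dev; the commit (and so the files) stay the same.
git checkout -q dev
git symbolic-ref --short HEAD

current=$(git symbolic-ref --short HEAD)
echo "now on: $current"
```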


[1] Git may also be able to store files whose name our OS can't handle. If we are on Windows or macOS, this happens all too often: some Linux user makes a file named aux.h, which Windows can't handle, or schön but spelled with the wrong UTF-8 byte sequence so that our macOS system gets confused. Or, they create both readme and README, and our case-folding file system can't handle it.


HEAD also determines the current commit

Note how at this point, there's only one commit to use, regardless of whether we tell Git to git checkout main or git checkout dev:

...--G--H   <-- dev (HEAD), main

So whichever one we use, we'll have the files from H. But let's make a new commit now, while we're in this setup, just like this. We edit some file, run git add for reasons we haven't covered, and run git commit. Git:

  • packages up a snapshot of every file, including the one we changed-and-added;
  • collects a log message from us, if we didn't give it one yet;
  • adds the current commit hash ID (whatever H is short for) as the parent;
  • adds the rest of the metadata; and
  • writes out a new commit.

The act of writing out a new commit produces a new, random-looking hash ID, but we'll just call this new commit I. New commit I has H as its parent. H was the current commit because HEAD said dev and dev said H. So, once this new commit goes in the repository, the sequence of commits is now:

...--G--H
         \
          I

But because we ran git commit, and because HEAD is attached to dev, the last step of git commit was to write the new commit's hash ID into the name dev. So now we have:

...--G--H   <-- main
         \
          I   <-- dev (HEAD)

Note that while HEAD is still attached to the name dev, the name dev now selects the new commit I.

New commit I is the last commit on dev now. Commit H remains the last commit on main. Commits up through H, which were on both branches, are still on both branches. But new commit I is only on dev right now.
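Here is a small sketch of the same situation in a scratch repository: a commit made while HEAD is attached to dev moves dev and leaves main alone. The names and messages are illustrative only.

```shell
# Throwaway repository: main and dev start at the same commit H.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo hello > file.txt
git add file.txt && git commit -q -m "H"
git branch -M main
git checkout -q -b dev            # create dev and attach HEAD to it

# Make new commit I while on dev.
echo more >> file.txt
git add file.txt && git commit -q -m "I"

# The two names now store different hash IDs.
git rev-parse main dev

# Commit I is only on dev; commit H is on both branches.
only_dev=$(git branch --contains dev --format='%(refname:short)')
echo "branches containing I: $only_dev"
git branch --contains main
```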

If, later, we have Git move the name main forward one step—this is a little bit tricky since Git really works backwards, but we can do it—we'll end up with:

...--G--H--I   <-- dev, main

and all the commits will be on both branches again (after which we can delete either of the two names: the other name suffices to find the commits). But for now, at least, we need both names, so as to know that H is the last commit of main and I is the last commit of dev.
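The text above does not say which command moves main forward. One common choice is a fast-forward merge, sketched here under that assumption; `--ff-only` makes Git refuse to do anything other than slide the name forward.

```shell
# Throwaway repository: main at H, dev one commit ahead at I.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo hello > file.txt
git add file.txt && git commit -q -m "H"
git branch -M main
git checkout -q -b dev
echo more >> file.txt
git add file.txt && git commit -q -m "I"

# Attach HEAD to main, then slide main forward to where dev is.
git checkout -q main
git merge -q --ff-only dev

# Both names now store the hash ID of commit I.
git rev-parse main dev
```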

Detached HEAD mode

One of the points of doing all of this stuff with commits—which save every file forever for us—is so that we can "go back in time", as it were, and look at how our software was yesterday, or last week, or last year. To do that, we use the kind of command you ran:

git checkout 53ba56611358e90d4990c3a8642e46c7bc93e514

This uses what Git calls detached HEAD mode. Here, instead of attaching HEAD to some branch name, Git just stores the raw hash ID directly into HEAD. We can draw that like this:

...--F   <-- HEAD
      \
       G--H   <-- dev

for instance.[2] Git removes the files from whatever commit we had out before (H, say) and populates our working tree with the files from commit F instead, and now we can see what our project looked like back then.

But: what if we make a new commit while in this mode? That's exactly what you did. Git's answer is: it makes a new commit the same way as always. You make some changes, git add the updated files, and run git commit, and Git packages up a snapshot and adds metadata, and makes a new commit.

The parent of the new commit is the current commit, F. So let's draw our new commit I now. It must reach back to commit F:

       I
      /
...--F
      \
       G--H

That part's pretty straightforward. But what happens to the names? Well, we aren't on dev, so dev continues to point to H. Git would normally write I's hash ID into the name to which HEAD is attached, but HEAD isn't attached to any name. So Git just writes the new commit's hash ID into HEAD itself, directly, like this:

       I   <-- HEAD
      /
...--F
      \
       G--H   <-- dev
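The sketch below reproduces this picture in a scratch repository, with made-up commit messages F, G, H, and I. It checks out F by hash ID, makes commit I, and shows that no branch name moved.

```shell
# Throwaway repository: F--G--H on dev.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo f > file.txt
git add file.txt && git commit -q -m "F"
git branch -M dev
f=$(git rev-parse HEAD)
echo g > file.txt && git commit -q -am "G"
echo h > file.txt && git commit -q -am "H"
h=$(git rev-parse HEAD)

# Check out F by hash ID: HEAD now holds a raw hash, not a branch name.
git checkout -q "$f"
git symbolic-ref -q HEAD || echo "HEAD is detached"

# Commit in this state: Git writes the new hash ID into HEAD itself.
echo i > file.txt && git commit -q -am "I"
i=$(git rev-parse HEAD)

echo "I:   $i (parent is F)"
echo "dev: $(git rev-parse dev) (still H)"
```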

If you then run git checkout dev, Git:

  • erases the files for commit I from the working tree;
  • fills in the working tree with the files from H; and
  • attaches HEAD to the name dev.

The result is:

       I   ???
      /
...--F
      \
       G--H   <-- dev (HEAD)

Where is commit I? It's still there. The problem is, there is no name for it. You will have to find its raw hash ID, and give that to git checkout or git branch. If the actual hash ID were abcdabc, you could run:

git checkout abcdabc

and you'd be back to the detached-HEAD mode, with HEAD pointing to commit I again. Then you would run:

git branch save-this

or whatever, and get:

       I   <-- HEAD, save-this
      /
...--F
      \
       G--H   <-- dev

Or, you could short-cut a bit and run:

git branch save-this abcdabc

and get:

       I   <-- save-this
      /
...--F
      \
       G--H   <-- dev (HEAD)

Since we didn't do a git checkout, HEAD remains attached to dev and commit H remains the current commit, but commit I (or abcdabc) now has a name and is easy to find.

The trick here is that you must somehow find the hash ID of the commit. That's where git reflog comes in: running git reflog, with no arguments, spills out the reflog entries for the special name HEAD. One of those will be the hash ID of the commit you made. All commit hash IDs look like random junk, of course, so you'll have to rely on something else to find the right one. Each reflog line for a commit you made includes the first line of its commit message, for instance, or you can run git show on each candidate hash ID, because git show will show you the commit.
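Putting the whole story together, here is a sketch that loses a commit the same way and then recovers it. It runs in a scratch repository with made-up names. In this tiny example the lost commit is the reflog entry just before the final checkout, so `HEAD@{1}` names it; in a real repository you would read the `git reflog` output and pick the right entry yourself.

```shell
# Throwaway repository: F--G on dev.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo f > file.txt
git add file.txt && git commit -q -m "F"
git branch -M dev
f=$(git rev-parse HEAD)
echo g > file.txt && git commit -q -am "G"

# Detach at F, commit, then switch away: the new commit has no name.
git checkout -q "$f"
echo new > file.txt && git commit -q -am "message"
git checkout -q dev

# Every place HEAD has been, newest first.
git reflog

# In this sketch only, the lost commit is the entry before the checkout.
lost=$(git rev-parse 'HEAD@{1}')
git show --stat "$lost"

# Give it a name so it is easy to find again.
git branch save-this "$lost"
git log --oneline -1 save-this
```

HEAD stays attached to dev throughout the recovery, since `git branch` only creates the name and does not check anything out.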


[2] I've left out main, but probably main or master exists and therefore points to some commit somewhere in your repository. This doesn't really matter for our purposes. As long as you're not using the name main or master to remember some specific commit, you can even just delete it entirely. We typically end up with main or master in a repository we clone from some centralized depot somewhere (maybe on GitHub) because they had a main or master and our Git made our main or master based on theirs. But that means we have an origin/main or origin/master too, and that name remembers the right hash ID. We don't need our own main or master; we can just use that remote-tracking name, which is not a branch name, for most purposes. We only need our own branch name if we need it for "branch-name-y" operations, like making new commits.


Side note: git show shows diffs

I've said several times now that every commit is a full snapshot of every file. Yet, if we run git show commit-specifier, we see stuff like this:

$ git show HEAD
commit eb27b338a3e71c7c4079fbac8aeae3f8fbb5c687 ...
Author: ...
Date:   ...
...
diff --git a/Documentation/RelNotes/2.33.0.txt b/Documentation/RelNotes/2.33.0.txt
index d4c56de5cb..a69531c1ef 100644
--- a/Documentation/RelNotes/2.33.0.txt
+++ b/Documentation/RelNotes/2.33.0.txt
@@ -30,6 +30,9 @@ UI, Workflows & Features
 
  * The userdiff pattern for C# learned the token "record".
 
+ * "git rev-list" learns to omit the "commit <object-name>" header
+   lines from the output with the `--no-commit-header` option.
[snip]

If a commit is a snapshot (and it is), why do we see a diff? The answer is: because Git figures one out to show, when you run git show (or git log -p, for that matter). If we're at some commit:

...--G--H   <-- main (HEAD)

it's easy for Git to turn HEAD into main and hence into hash ID H. But it's also easy for Git to then use hash ID H to retrieve the metadata for commit H, from which Git finds the hash ID G. Git can now use G and H to retrieve the entire snapshots in both G and H, and now all it has to do is compare them.[3] The result of this comparison is usually far smaller than the set of files inside the snapshot, and also much more useful for a human. So that's what git show shows.
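You can ask Git to do the same parent-versus-child comparison explicitly with `git diff`, as in this sketch. It uses a scratch repository with made-up file contents; for an ordinary one-parent commit, the patch should match what git show prints below the commit header.

```shell
# Throwaway repository: G has one line, H adds a second.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "line one" > notes.txt
git add notes.txt && git commit -q -m "G"
echo "line two" >> notes.txt
git add notes.txt && git commit -q -m "H"

# Header plus a diff that Git computes on the spot.
git show HEAD

# The same comparison, spelled out: parent snapshot vs. this snapshot.
patch=$(git diff HEAD~1 HEAD)
echo "$patch"
```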

Note that merge commits are commits that (by definition) have at least two parents, instead of the usual one. In this case, it is not clear which parent(s) Git should compare to the child. What git log -p does, at least by default, is simple, though often useless: it just doesn't bother to show a diff at all. What git show does is more complicated; we won't cover that here.


[3] "All", I say, as if that's just a Small Matter of Programming. However, the internal format for commit snapshots, with its de-duplication tricks, makes it trivially easy for Git to tell if two files are the same. So Git can skip over every unchanged file easily. Git only needs to run a complicated diff engine algorithm over two files that aren't the same, in the two snapshots.

torek