There are many different ways to manage this kind of workflow in Git, because Git is a set of tools rather than a particular solution.
When working with Git, keep all of these things in mind:
Git is really all about commits. Commits (like all Git internal objects) are completely immutable once created.
Commits store snapshots of your files: not changes, just snapshots. In some ways this doesn't really matter, but it makes it a lot easier to understand some weird Git corners. Since the commits are immutable, so are the stored files. These stored files are also compressed (sometimes in very fancy ways) and de-duplicated, and not usable by anything other than Git itself. They have to be extracted from a commit for you to use them (see below).
Each commit has a parent commit, or for merge commits, two (or potentially more) parents. These are stored by hash IDs.
The hash ID of each commit is how Git actually retrieves the commit, so hash IDs are crucial here. But hash IDs look random and are completely unsuitable for human use, therefore ...
Git gives us branch names, which simply store the hash ID of the last commit that we'd like Git to think of as being "on the branch".
A Git repository, then, is essentially just two databases. One contains all the commits and supporting internal objects—all indexed by hash ID—and the other contains the name-to-hash-ID mapping. The files that you work with, as you do your actual work, aren't in Git at all. That is, they're not the ones that are in the repository database. They are just in an auxiliary area.
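To make the "two databases" idea concrete, here is a small sketch you can run in a scratch directory. The temporary directory, the identity settings, and the name file.txt are all made up for the illustration; only the Git commands themselves are real.

```shell
set -e
# Throwaway repository; the path and identity are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

echo "hello" > file.txt
git add file.txt
git commit -q -m "first commit"

# Database 1: objects, looked up by hash ID.
hash=$(git rev-parse HEAD)
git cat-file -t "$hash"    # prints the object's type
git cat-file -p "$hash"    # prints the commit's tree, author, committer, message

# Database 2: names, each mapped to a hash ID.
git for-each-ref refs/heads
```

The last command lists each branch name next to the hash ID it currently stores, which is all a branch name really is.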
This work area, where you work on files, is your working tree or work-tree. If you use git clone to make your repository, Git creates the work-tree for you. If you build your own work-tree before you create a .git with git init, you have already set up the work-tree, and the git init step just creates a new empty repository: two empty databases, one for commits and the other for names.
When you check out some particular commit, whether that's by an explicit "check out this historical commit so I can view it" or implied by "check out some branch by name so that I can do new work", Git will extract the saved snapshot into your work-tree. But apart from this and other Git commands that explicitly tell Git "do something with my work-tree file(s)", these files are now yours to do with as you will.
There's one more wrinkle here. When you go to make a new commit, Git does not use the files in your work-tree. Instead, Git uses hidden copies—technically these aren't exactly copies, but it works to think of them like that—of the frozen-format files that get stored in commits. These "copies" live in what Git calls, variously, the index, or the staging area, or—rarely these days—the cache.
The index has multiple roles, but you can think of it as containing the proposed next snapshot. It starts out matching the current snapshot, as extracted into your work-tree. The git diff --staged and git status commands don't show you the index copies of the files because they're the same as the snapshot's copy of the files. When you use git add, you're telling Git: Copy my work-tree copy of the file back into the index, replacing the old copy, or putting an all-new file into the index. Now that the index copy doesn't match the current-commit copy, git status and git diff --cached (a synonym for --staged) will show you that file. Change it back, so that it matches the committed copy, and they'll stop showing the file again.
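Here is a sketch of that index behavior in a scratch repository. The file name and contents are invented for the example; the point is only to watch when git diff --cached starts and stops listing the file.

```shell
set -e
# Throwaway repository; the path and identity are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

echo "v1" > file.txt
git add file.txt
git commit -q -m "initial snapshot"

# Index matches the current commit: nothing is listed.
staged_initially=$(git diff --cached --name-only)

# Changing only the work-tree copy does not touch the index.
echo "v2" > file.txt
staged_before_add=$(git diff --cached --name-only)

# git add copies the work-tree version into the index.
git add file.txt
staged_after_add=$(git diff --cached --name-only)

# Put the original content back and add again: index matches the commit again.
echo "v1" > file.txt
git add file.txt
staged_after_revert=$(git diff --cached --name-only)

echo "after add: '$staged_after_add'; after revert: '$staged_after_revert'"
```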
Since your work-tree is yours, you can create files that never get into Git at all. These are your untracked files. To help ensure you don't accidentally put the untracked files into the repository, you can list files or name-patterns in .gitignore. Note that once a file is tracked, listing it in .gitignore has no effect.
A file in your work-tree is tracked if and only if that file exists right now in Git's index. While this definition is very short and simple, it has a long shadow: since Git fills in its index from a commit (via git checkout or git switch), a file that exists in commit X but not in commit Y can switch from being tracked to untracked, or vice versa, just by changing which commit you have checked out. You can also modify, create, or remove specific files within the index yourself, with git add and git rm. Whenever you do this, you're changing the proposed next commit. None of this has any effect until you actually run git commit.
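The tracked/untracked definition is easy to poke at with git ls-files, which lists what is in the index. This is just a sketch: the branch name other and the files a.txt, b.txt, and notes.tmp are hypothetical.

```shell
set -e
# Throwaway repository; the path and identity are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

echo a > a.txt
git add a.txt
git commit -q -m "commit Y: only a.txt"

git checkout -q -b other
echo b > b.txt
git add b.txt
git commit -q -m "commit X: adds b.txt"
git ls-files                 # a.txt and b.txt are both in the index

# Switch back to the first branch: b.txt leaves the index and the work-tree.
git checkout -q -
tracked_on_first=$(git ls-files)

# A scratch file is untracked until added; .gitignore keeps it out of the way.
echo scratch > notes.tmp
git status --porcelain       # shows notes.tmp as untracked (??)
echo "*.tmp" > .gitignore
status_with_ignore=$(git status --porcelain)
echo "$status_with_ignore"   # now only .gitignore itself is untracked
```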
With the above in mind, we're ready to tackle your particular case
Let's jump right to step 3:
Perfect, taskA
is working! Now I want to start taskB
. This new task is based upon taskA
so I create the new branch from it and start working.
Since commits refer back to earlier commits, let's draw this. Suppose the hash ID of the last commit in your taskA
branch (which now exists) is H
, where H
stands in for the real Git hash ID. Then the name taskA
is a way for Git to remember hash ID H
for you. Commit H
itself has a parent, with another big ugly hash ID, but we'll call that parent G
. G
has a parent, too, which we'll call F
, and so on:
... <-F <-G <-H <--taskA
The name taskA
selects commit H
(for now). The name taskB
does not even exist yet.
Now you create taskB
. I'm going to switch from drawing the internal arrows, which point backwards from each commit, to lines because the arrow drawing character set for posting here on StackOverflow is poor, but this just adds another name, taskB
, that also selects commit H
:
...--F--G--H <-- taskA, taskB (HEAD)
We now need to know which name we're using, as well as which commit, so we'll attach the special name HEAD
to one of these two branch names.
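You can check this picture yourself. In the sketch below (scratch repository, made-up file name), creating taskB adds a second name for the same commit, and git symbolic-ref shows which name HEAD is attached to.

```shell
set -e
# Throwaway repository; the path and identity are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

git checkout -q -b taskA
echo "taskA work" > a.txt
git add a.txt
git commit -q -m "H: finish taskA"

# Create taskB at the current commit and attach HEAD to it.
git checkout -q -b taskB

git rev-parse taskA taskB        # prints the same hash ID twice
current=$(git symbolic-ref --short HEAD)
echo "HEAD is attached to: $current"
git log --oneline --decorate -1  # one commit, decorated with both names
```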
Now working in taskB
I realize that I could change some things in my code to improve taskA
. Let's say some files changed in taskB
should now be committed to taskA
.
This is where you suddenly get a lot of options.
My favorite one for descriptive purposes is git worktree
, specifically git worktree add
. But git worktree
was new in Git 2.5, and had a nasty bug finally fixed in 2.15, so unless your Git is reasonably modern, you might want to avoid it. It's also going to create a little bit of extra work for you, if you go this way, but it's a very general solution.
What git worktree add
does is let you add a second work-tree to your existing repository. Each added work-tree gets:
- its own HEAD, so that it can (and in fact must) have a different branch checked out;
- its own index, i.e., proposed next commit; and
- of course, its own work-tree full of files.
So you can use git worktree add
to make two independent work areas, each of which is "on" a different branch. You can then just take this moment to:
- open a new window on (or push directories to) the work-tree in which you're working on taskA;
- modify the files there, however you like (up to and including copying them from the work-tree where you are working on taskB), and then git add and git commit.
Let's say you do make a new commit in this added work-tree. We can draw that. We start with this:
...--F--G--H <-- taskA (HEAD), taskB
and modify some files, git add
, and run git commit
. This makes a new commit—which gets a new big ugly hash ID; we'll call it I
—and makes the name taskA
point to this new commit:
             I <-- taskA (HEAD)
            /
...--F--G--H <-- taskB
If we switch back to the taskB
window / work-tree, where taskB
is the HEAD
, we have:
             I <-- taskA
            /
...--F--G--H <-- taskB (HEAD)
The files in the work-tree here—the original one, not the added one—match those of commit H
, except for any changes you've made so far. The files in the index for this work-tree match those for commit H
. Any new commit you make now will update the name taskB
like this:
             I <-- taskA
            /
...--F--G--H
            \
             J <-- taskB (HEAD)
Again, the new snapshot comes from the index. The commits have not changed: we've merely added some new ones. The parent of new commit I
is existing commit H
. The parent of new commit J
is existing commit H
. Commits up through and including H
are on both branches, but the branches now diverge.
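Put together, the git worktree add route might look like the sketch below. The directory names main and taskA-tree and the files shared.txt and b.txt are assumptions for the example, and it needs a Git new enough to have git worktree.

```shell
set -e
# Throwaway layout: $base/main is the original work-tree, $base/taskA-tree the added one.
base=$(mktemp -d)
git init -q "$base/main"
cd "$base/main"
git config user.name "Example User"
git config user.email "example@example.com"

git checkout -q -b taskA
echo v1 > shared.txt
git add shared.txt
git commit -q -m "H"

git checkout -q -b taskB                 # main work-tree is now on taskB
echo "taskB work in progress" > b.txt    # uncommitted taskB work stays put

# Add a second work-tree with taskA checked out.
git worktree add ../taskA-tree taskA

# Fix up taskA in the added work-tree and commit there: this is commit I.
cd ../taskA-tree
echo v2 > shared.txt
git add shared.txt
git commit -q -m "I: improve taskA"

# Back in the original work-tree, still on taskB, make commit J.
cd ../main
git add b.txt
git commit -q -m "J: taskB work"

git log --oneline --graph --all --decorate
```

The final log shows I and J both sitting on top of H, one per branch, matching the drawing.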
What if you don't want to use git worktree add?
Remember that Git makes each new commit from the index, not from your work-tree. Suppose we have:
...--F--G--H <-- taskA, taskB (HEAD)
but the index matches commit H
. Git will let you switch back to taskA
without disturbing the index and work-tree content at all. (This is not always true, but it is true given our suppositions and setup here. For the gory details, see Checkout another branch when there are uncommitted changes on the current branch.) So let's say we do that:
git checkout taskA # or git switch taskA
...--F--G--H <-- taskA (HEAD), taskB
Now we just git add
the one or two files you would like to be different—which copies them into the index, ready for the next commit—and then run git commit
. Since we're using the index files, not the work-tree files, any updates you made but did not git add
do not go into the new snapshot.
We get:
             I <-- taskA (HEAD)
            /
...--F--G--H <-- taskB
exactly as before. The name taskB
does not move.
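As a sketch of this single-work-tree route (scratch repository, with hypothetical files file1 and file2): both files get edited while on taskB, but only the one that is git add-ed after switching ends up in the new taskA commit.

```shell
set -e
# Throwaway repository; the path and identity are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

git checkout -q -b taskA
echo v1 > file1
echo v1 > file2
git add file1 file2
git commit -q -m "H"
git checkout -q -b taskB

# Work in progress on taskB: file1 really belongs in taskA, file2 does not.
echo "improved for taskA" > file1
echo "taskB only" > file2

# taskA and taskB both name commit H, so Git switches without touching the files.
git checkout -q taskA
git add file1
git commit -q -m "I: improve taskA"

git status --porcelain    # file2 is still modified and unstaged
```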
When we now git checkout taskB
, Git sees that the files we just updated are different in commits H
and I
. So Git will copy H
's copy of those files out of the commit, into Git's index (so that they match H
) and your work-tree, and the changes you just made for taskA
are gone. But we can bring them back into the work-tree:
git checkout taskA -- file1 file2
or (since Git 2.23):
git restore -s taskA -S -W file1 file2
which tells Git: reach into the commit identified by the name taskA
—commit I
—and pull out these two files and copy them into the index and my work-tree. So now you're back to having the updated files, along with all the other undisturbed files. The updated files are already changes staged for commit, as git status
will say, as they're in the proposed next commit in the index.
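Here is a cut-down sketch of pulling a file back out of taskA after switching to taskB. It uses a single hypothetical file, file1, and skips the uncommitted taskB edits to keep the example short.

```shell
set -e
# Throwaway repository; the path and identity are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

git checkout -q -b taskA
echo v1 > file1
git add file1
git commit -q -m "H"
git branch taskB                 # taskB also names H

echo v2 > file1
git commit -q -a -m "I: improve taskA"

# Switching to taskB puts H's copy of file1 back in the index and work-tree.
git checkout -q taskB
content_after_switch=$(cat file1)

# Reach into taskA's tip commit and copy file1 into the index and work-tree.
git checkout taskA -- file1
content_after_restore=$(cat file1)

git status --porcelain           # file1 shows as staged for commit
```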
You can now finish up the stuff you were doing, git add
, and git commit
as needed, giving:
             I <-- taskA
            /
...--F--G--H
            \
             J <-- taskB (HEAD)
exactly as before.
You may now want to rebase taskB
However you got to this point, you now have taskB
extending from commit H
, rather than from commit I
as you might wish.
No commit can ever be changed, but any commit can be copied. What if we copy commit J
to a new-and-improved commit—let's call it J'
—where the snapshot in J'
matches the snapshot in J
, plus any changes from H
-to-I
if needed? (They're already in J
so they are not needed, but Git would put them in if they were.)
We can get this by using git cherry-pick
. We first create a new temporary branch temp
, pointing to commit I
:
             I <-- taskA, temp (HEAD)
            /
...--F--G--H
            \
             J <-- taskB
Now we tell Git: copy commit J
to where we are now:
git cherry-pick taskB
which produces:
               J' <-- temp (HEAD)
              /
             I <-- taskA
            /
...--F--G--H
            \
             J <-- taskB
Note that, yet again, we have not changed any existing commit at all. We have just added a new commit J'
whose parent is I
.
Now that we have copied all the taskB
commits (all one of them) to new-and-improved commits, we just need to tell Git: Take that name taskB
and move it in a way such that we'll forget all about the old commit J
. Specifically, force taskB
to point to the current commit. We do this with:
git branch -f taskB HEAD
which results in:
               J' <-- taskB, temp (HEAD)
              /
             I <-- taskA
            /
...--F--G--H
            \
             J ???
Note that there is now no name by which to find existing commit J
. So when you have Git list out the commits it can find by branch names, commit J
does not show up at all. A new and different hash ID—that of J'
—does. Now we just switch back to branch taskB
and delete the temporary name and we have:
               J' <-- taskB (HEAD)
              /
             I <-- taskA
            /
...--F--G--H
as if we had been clever enough to make commit I
first all along.
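The four manual steps can be sketched end to end like this. The repository is a throwaway, the file names are invented, and I and J touch different files so that the cherry-pick applies without conflicts.

```shell
set -e
# Throwaway repository; the path and identity are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

# Build the diverged history: H, then J on taskB and I on taskA.
git checkout -q -b taskA
echo base > base.txt
git add base.txt
git commit -q -m "H"
git checkout -q -b taskB
echo b > b.txt
git add b.txt
git commit -q -m "J"
git checkout -q taskA
echo a > a.txt
git add a.txt
git commit -q -m "I"
old_J=$(git rev-parse taskB)

git checkout -q -b temp taskA    # 1. temporary name pointing to I
git cherry-pick taskB            # 2. copy J to J', on top of I
git branch -f taskB HEAD         # 3. force taskB to point to J'
git checkout -q taskB            # 4. switch back and drop the temporary name
git branch -q -d temp

git log --oneline --graph --decorate
```

The old commit J still exists in the repository for a while (its hash ID is in old_J here), but no branch name finds it any more.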
We don't need to use four separate Git commands—create temp branch name, cherry-pick commits to make new-and-improved-copies, forcibly move old branch name, delete temp branch name—because git rebase
does that for us. That's what git rebase
is really about.
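For comparison, here is the same result using git rebase directly, again as a sketch in a scratch repository with made-up file names.

```shell
set -e
# Throwaway repository; the path and identity are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

# Same diverged history as before: H, then J on taskB and I on taskA.
git checkout -q -b taskA
echo base > base.txt
git add base.txt
git commit -q -m "H"
git checkout -q -b taskB
echo b > b.txt
git add b.txt
git commit -q -m "J"
git checkout -q taskA
echo a > a.txt
git add a.txt
git commit -q -m "I"
old_J=$(git rev-parse taskB)

# One command: copy taskB's own commits onto taskA's tip and move the name.
git checkout -q taskB
git rebase -q taskA

git log --oneline --graph --decorate
```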
The one drawback to rebasing: if someone else has the commit(s)
You did not mention doing a git push -u origin taskB
above, but if you had done that, you would have sent a request to another Git, the one over at origin
, to take any new commits you have that they don't, that they need, and then to create, in their Git repository, their branch name taskB
, pointing to whichever commit your name taskB
points to.
When you use git rebase
you tell your Git: Copy some commits to a new place, then throw out the old commits in favor of the new-and-improved copies. Your Git obeys. If you now have your Git ask their Git to update their taskB
name:
git push origin taskB
they will sometimes say no! In particular, they will see whether this action will drop some commit(s) from their taskB
. If that's the case, they will reject the push with the error non-fast-forward. But of course that's just what you would want in this case: you made some commits, then you made new-and-improved commits and they should lose the old ones. To get them to do that, you will need a more forceful git push
.
Whenever a branch gets rebased regularly, all users of any shared Git repository should be aware of this. That's because ... well, suppose Alice pushes a commit. Then Bob gets Alice's commit from the shared repository, and starts building his own additional commits. Then Alice changes her mind and rebases, throwing out the old commits in favor of new-and-improved ones. But Bob still has, and has based his commits on, the old ones! Alice and Bob are in effect fighting over which commits are the good ones.
This is not all that hard to deal with technically, usually. For instance, here, Bob just needs to rebase his commits on Alice's new ones, dropping Alice's old ones. If everyone agrees in advance that this sort of thing happens, and knows how to deal with it, that's no problem.
If the origin
repository is private (so there are not separate users Alice and Bob), or your branch on origin
is private (same condition), or everyone agrees that rebasing happens (Alice and Bob are both ready to check these things), there is no problem here. Just be aware of the pitfalls. Consider using git push --force-with-lease
as a safety check, too.
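To see the rejection and the forced push without involving a real server, the sketch below uses a local bare repository as a stand-in for origin. The paths and file names are assumptions made up for the example.

```shell
set -e
# Throwaway layout: a bare repository stands in for the shared "origin".
base=$(mktemp -d)
git init -q --bare "$base/origin.git"
git init -q "$base/work"
cd "$base/work"
git config user.name "Example User"
git config user.email "example@example.com"
git remote add origin "$base/origin.git"

git checkout -q -b taskA
echo base > base.txt
git add base.txt
git commit -q -m "H"
git checkout -q -b taskB
echo b > b.txt
git add b.txt
git commit -q -m "J"
git push -q -u origin taskB      # origin's taskB now names J

git checkout -q taskA
echo a > a.txt
git add a.txt
git commit -q -m "I"
git checkout -q taskB
git rebase -q taskA              # taskB now names J', not J

# A plain push would drop J from origin's taskB, so origin refuses it.
if git push -q origin taskB 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi
echo "plain push: $plain_push"

# Forced, but only if origin's taskB is still where we last saw it.
git push -q --force-with-lease origin taskB
```

If someone else had pushed to taskB on origin in the meantime, the lease check would fail and the push would be refused instead of silently discarding their commits.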