Git create feature branch before squash and merge

Question

This is probably a process that I am still learning. Following is the current code review cycle we are following in Git.

1. Write the code in feature branch 1.
2. Create a new PR to review this code.
3. Once review is approved, "Squash and Merge" the feature branch 1 to master.

Can I create a feature branch 2 without waiting for the review to be completed for feature branch 1 and still get all the latest changes?

Process I was following till now:

1. Write the code in feature branch 1.
2. Create a new PR to review this code.
3. Wait until the review is completed for feature branch 1 before creating any more new branches.

Can someone suggest a proper process?

It depends on what you mean by "and still get all the latest changes" — do you want the changes from branch 1? If so, just create branch 2 off of branch 1. If branch 2 doesn't depend on changes from branch 1, you can just create branch 2 off of your main branch. There's no reason to wait until branch 1 has been merged before continuing work on another task. — Scott, Jun 07 '22 at 03:34
If I create branch 2 from my main branch, how to merge branch 2 changes to main? Just squash and merge ? — Sushivam, Jun 07 '22 at 03:38
Yes, just create a PR for branch 2 into main; if there are any conflicts with changes that may have come in from branch 1 (or any other branch), you'll have to resolve them before you can merge, but otherwise you can create N number of branches and N number of PRs all at once. — Scott, Jun 07 '22 at 03:40
So this is the process: 1. git checkout master 2. git checkout -b feature_branch2 3. When all reviews are completed, squash and merge. 4. Resolve conflicts if any . — Sushivam, Jun 07 '22 at 03:46
Will I be able to resolve conflicts if I am using the "Squash and Merge" option from the Git GUI? I mean, without the command line interface? — Sushivam, Jun 07 '22 at 03:47
When you attempt to merge, git will notify you if it encountered a conflict, at which point you'll have to resolve them, then mark the files as resolved in git. I'd recommend researching some basics of git branching/merging: https://www.atlassian.com/git/tutorials/using-branches — Scott, Jun 07 '22 at 04:05

score 2 · Answer 1 · answered Jun 07 '22 at 19:03

The short answer to your question as asked is—as Scott noted in a comment—that it depends on what you mean by "and still get all the latest changes". But I think your question itself reveals a fundamental misunderstanding of Git that will cause you problems in the future.

The first tricky thing here is to realize that Git does not "store changes" in the first place. What Git stores are commits. Commits are what Git is all about, and each commit stores a full snapshot of every file, not some set of changes. That's not everything that a commit stores, but it's crucial to understand that a commit holds a snapshot.

The second tricky thing is to understand what a squash merge is and does in Git. But we cannot tackle this until we cover commits properly.

A Git commit is ...

A commit, in Git:

Is numbered: every commit has a unique number, which Git calls an object ID or less formally a hash ID. This is a very large number, normally expressed in hexadecimal, such as ab336e8f1c8009c8b1aab8deb592148e69217085. Git needs this number in order to find the commit. Every Git everywhere in the universe uses the same number for this particular commit (this is a commit in the Git repository for Git). If you have a commit with this number in your Git repository, it is this commit; if you don't have this commit, none of the commits in your repository use this number.
Is read-only: no part of any commit can ever be changed after it's made. This is required to make the numbering system work.
Holds two things: the snapshot mentioned earlier—a full copy of every file that Git knew about at the time you, or whoever, made that snapshot—and metadata, or information about the commit itself, such as the name and email address of the person who made it.

The full copy of every file is stored in a clever way: the files are all read-only, compressed—sometimes very compressed—and de-duplicated. This means that new commits take almost no space, because most new commits mostly re-use files from old commits, and as those are duplicates they occupy no space at all. Other files are only slightly changed from some previous file, so that the clever compression techniques can share most of the space needed for the multiple copies of those not-quite-identical files. All of this cleverness is normally well-managed for you, by Git, so that you don't need to think about it or do anything special; it Just Works.

Some of the metadata in a commit is crucial to Git's own operation, though. In particular, besides the name and email address of an author and committer, and some date-and-time stamps and a log message, Git adds to each commit a list of previous commit hash IDs. Git calls these the parents of the commit. Most commits have just one single parent, and this means that most commits form a nice, neat, simple, but backwards-looking chain.
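To make the snapshot-plus-metadata idea concrete, here is a minimal sketch that builds a throwaway repository and prints one raw commit. The file name, commit messages, and identity are made up for illustration; only the Git commands themselves are real.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # call the first branch "main" whatever the local default is
git config user.name "Example User"      # placeholder identity for this throwaway repository
git config user.email "user@example.com"

echo "first" > file.txt
git add file.txt
git commit -q -m "first commit"
echo "second" >> file.txt
git commit -q -am "second commit"

# Raw commit object: a tree line (the snapshot), a parent line,
# author and committer lines, then the log message.
git cat-file -p HEAD

# The parent recorded in the newest commit is the hash ID of the first commit.
parent_in_metadata=$(git cat-file -p HEAD | sed -n 's/^parent //p')
first_commit=$(git rev-parse HEAD~1)
```

Running git cat-file -p on the first commit instead should show no parent line at all, since it has nothing to point back to.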

If we go to draw this chain, we need hash IDs of commits. But hash IDs are too hard for humans to work with, so instead, we'll substitute in single uppercase letters. (Git obviously can't use this for real, as Git would run out of commit "numbers" way too fast, but for simple drawings, we can just pretend.) Let's call the latest commit "commit H" with H standing for hash ID. Commit H contains the hash ID of some earlier commit; let's call that earlier commit G. We say that H points to G, so let's draw this as an arrow coming out of H, pointing (backwards) to G:

          G <-H

But G is a commit, so it must also have an arrow pointing out of it, backwards, to some still-earlier commit. Let's call that commit F. F, too, is a commit, so it has an arrow coming out of it:

... <-F <-G <-H

This is how our backwards-pointing chain of commits works, in Git. We simply (somehow) memorize the hash ID of the latest commit, and give that to Git and say to it: Using this latest commit hash ID H, find me all the commits. Git finds the latest commit H, fishes out the hash ID of the second-to-latest commit G, uses that to find commit G, fishes out the hash ID of an earlier commit F, uses that to find F, fishes out the hash ID of an earlier commit (E presumably), and so on.

By doing this over and over again, Git will eventually reach the very first commit ever in this repository. That commit has an empty list of previous-commit hash IDs, to indicate that it is in fact the very first commit ever and that it is impossible to go back any further. Git will stop here. Git has used one commit hash ID to find every commit on this branch.
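The backwards walk described above can be watched directly. This is only a sketch in a scratch repository (the file contents and messages are invented); git rev-list is the command that does the walking.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

for n in 1 2 3; do
  echo "version $n" > file.txt
  git add file.txt
  git commit -q -m "commit $n"
done

# One line per commit, newest first: the commit's hash ID followed by its parent's.
# The last line (the very first commit) lists no parent.
git rev-list --parents main

count=$(git rev-list --count main)              # how many commits the walk finds
root=$(git rev-list --max-parents=0 main)       # the commit with an empty parent list
root_line=$(git rev-list --parents -n 1 "$root")
```

Starting from the single name main, Git reaches all three commits and stops at the one whose parent list is empty.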

Branch—wait, what?

I just used the word branch. The word branch in Git is ambiguous: it has at least two or three different meanings, and people use it casually without saying which of the several meanings they actually mean. You have to get used to this, but it's a good idea to ask if you're not sure. Let's take a look now at some of these meanings.

We noted above that we had to somehow memorize the hash ID of commit H, the latest commit in our chain. But humans are bad at hash IDs. We literally can't memorize them. Then again, if you just think about this for a moment, we don't have to memorize them. We have a computer. Computers are good at this kind of stuff. Let's have Git keep a separate table (or database) of names, and in this table / database, store one hash ID for each branch name. Now instead of memorizing the hash ID of H, we can use the name main. Git will store the hash ID of commit H under this name main:

...--F--G--H   <-- main

The branch name main points to the latest commit in the chain. We call this latest commit the tip commit of the branch, and we call the name "the branch", and we call the set of commits found by starting at the tip and working backwards "the branch" as well.

There are other kinds of names in a Git repository—tag names, remote-tracking names, and the like—and each of those also stores a hash ID, and sometimes we call some set of commits found by starting with a commit found by those names "a branch", too. Git uses the term remote-tracking branch name, rather than remote-tracking name, so sometimes people call things like origin/main a "branch" name. (It's not—quite—and with this overuse of the word branch, that's why I leave that word out and just call it a remote-tracking name.)

Anyway, the other thing about branch names is that Git lets us create and destroy them at any time. Git does not need branch names at all. Branch names are for humans, not for Git. All Git requires of a branch name is that it point to exactly one commit—the tip commit of that branch—and we can make up new ones any time we like:

...--F--G--H   <-- develop, main

Now we have two branch names, both of which point to commit H. If we like, we can add a third branch name, and make it point to an older commit:

...--F   <-- old
      \
       G--H   <-- develop, main

Now we have three branch names, with the name old pointing to (selecting) commit F. Note that no commits have changed in this process (they literally can't). We have just made names that point to particular commits. The backwards-pointing chains still point backwards, to the same commits as before. The fact that we can add and remove branch names at any time:

...--F   <-- old
      \
       G--H   <-- develop

—now there's no main branch at all—means that commits aren't defined by their branch names. The commits exist independent of their branch names. The names are just movable (and removable) labels that help us humans find the commits.

Nonetheless, we often say that some commit is on some branch. All this means is that by starting from the branch name, i.e., the tip commit, and working backwards, we can arrive at the commit in question. For instance commit F is "on" both old and develop branches at this point. If we put the name main back, commit F is on all three branches, and if we remove the two other names and keep just main and get:

...--F--G--H   <-- main

commit F is now only on one branch, namely main.

What this means is that the set of branches that contain any given commit changes over time. Sometimes, we care about which branches contain some commit, but most of the time, we mostly care about what's in the commit itself, and that's the snapshot and metadata. Those never change—they literally can't change—and given the commit's hash ID, if we have that commit, we have that snapshot and that metadata.
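Here is a small sketch of names coming and going while the commits stay put. The branch names mirror the drawings above; everything else (file name, messages) is invented for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

for name in F G H; do
  echo "$name" > file.txt
  git add file.txt
  git commit -q -m "commit $name"
done

git branch develop           # a second name for the tip commit H
git branch old main~2        # a third name, pointing at the older commit F
h=$(git rev-parse main)
f=$(git rev-parse old)

git switch -q develop        # step off main so that the name can be deleted
git branch -q -d main        # removes only the name; the commits are untouched

# Which branch names can still reach commit F?
on_branches=$(git for-each-ref --contains="$f" --format='%(refname:short)' refs/heads | xargs)
echo "F is on: $on_branches"
```

After main is deleted, commit H is still there under the name develop, and F is "on" whichever names happen to reach it.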

Viewing a commit compactly

We often like to talk about the change or changes in some commit. You did this yourself in your question. But a commit doesn't contain changes: it contains a snapshot and metadata. So what does "changes in a commit" mean?

Let's take commit H again for a moment. Its parent is commit G. What if we have Git extract, somewhere—maybe just in the computer's memory—both snapshots and compare them? We—or Git—will find that most of the files are completely identical. Since Git stores files de-duplicated, Git can find these identical files very quickly. So Git rapidly weeds out all the 100%-identical files and only needs to compare the differing files.

When Git compares two differing files, Git can present this difference as a set of changes. This diff output shows up as one or more "diff hunks", with @@-lines and such. For more, see, e.g., the question titled In the context of git (and diff), what is a "hunk".

By showing us what changed, Git gives us what we often want. We see commit H as a set of changes. It isn't: it's really a snapshot. But it's a snapshot plus metadata, and its metadata lead us (and Git) to a single parent commit, which also has a snapshot, and that pair of snapshots lets us see what we want. We see changes, even though Git has snapshots.
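As a sketch of this snapshot-to-changes conversion, the following scratch repository has two files, only one of which differs between the parent and the child commit. The file names and contents are assumptions made up for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo "unchanged" > a.txt
echo "old text" > b.txt
git add a.txt b.txt
git commit -q -m "G"
echo "new text" > b.txt
git commit -q -am "H"

# Identical content has an identical hash ID, so a.txt is stored once and shared.
a_in_g=$(git rev-parse HEAD~1:a.txt)
a_in_h=$(git rev-parse HEAD:a.txt)

# Comparing the two snapshots is what turns them into "changes".
changed=$(git diff --name-only HEAD~1 HEAD)
git --no-pager diff HEAD~1 HEAD
```

Both commits hold both files, yet the diff mentions only b.txt, because a.txt is the same stored object in each snapshot.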

Making new snapshots

I'm just going to touch on this briefly here; you should definitely read more about this elsewhere (particularly, learn about Git's index aka staging area). Because a commit is read-only and its files are in a special compressed Git-only format, we literally can't work on a commit directly. Instead, we have Git check out a commit, using git checkout or git switch. We pick a branch name:

...--G--H   <-- main

and switch to it, so that H is our current commit and main is our current branch name. When we do this, Git extracts, to a work area, all the stored files for commit H (first removing the stored files from any previously checked-out commit).

We may now create a new branch name and switch to it, still using the files from commit H, and still keeping commit H as our current commit, but with a new name as the current branch name:

...--G--H   <-- feature, main

To keep track of the current branch name in our drawings, we'll add the special name HEAD in all uppercase and in parentheses, "attaching" this name to one of the branch names:

...--G--H   <-- feature (HEAD), main

Now we modify the files in our working tree—the ones that came out of commit H—and use git add and git commit to make a new commit. (Note: these files are not in Git. Git gives up all control over them while we work on them.) The new commit we make will have some big ugly random-looking hash ID, but let's call this commit I, and draw it in:

...--G--H
         \
          I

New commit I will point back to existing commit H, because we used commit H in order to make commit I. Now, having made new commit I, Git stores I's hash ID into the current branch name, so that we get:

...--G--H   <-- main
         \
          I   <-- feature (HEAD)

This is how we add commits to a branch using git commit. New commit I is currently only "on" (reachable from) branch name feature, where it is the tip commit of that branch. Commits up through H are still on both branches. You can keep on adding more commits:

...--G--H   <-- main
         \
          I--J   <-- feature (HEAD)

Now commits I-J are only on feature.
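The drawing above can be reproduced in a scratch repository. This is a sketch with invented file contents; the point is just that committing moves the name HEAD is attached to and leaves the other name alone.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo "H" > file.txt
git add file.txt
git commit -q -m "H"
h=$(git rev-parse HEAD)

git switch -q -c feature     # new name, same commit; HEAD is now attached to feature
echo "I" >> file.txt
git commit -q -am "I"
echo "J" >> file.txt
git commit -q -am "J"

current=$(git symbolic-ref --short HEAD)          # the branch name HEAD is attached to
main_tip=$(git rev-parse main)                    # still commit H
only_on_feature=$(git rev-list --count main..feature)
echo "on $current; $only_on_feature commits are only on feature"
```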

We can switch back to branch main with git switch main or git checkout main, giving this:

...--G--H   <-- main (HEAD)
         \
          I--J   <-- feature

We now have the files from commit H in our work area; those in commits I and J are safely saved (forever—or at least, as long as the commits themselves continue to exist) in those commits. We can now create yet another branch name, feat2 perhaps. For no obvious reason I'll draw I-J on the top row now though, and change the name to feat1 (git branch -m feature feat1):

          I--J   <-- feat1
         /
...--G--H   <-- feat2 (HEAD), main

Now we make two more commits:

          I--J   <-- feat1
         /
...--G--H   <-- main
         \
          K--L   <-- feat2 (HEAD)

Merging

A lot of Git's power, and hence reason to use it, comes from Git's ability to merge. But once again the key is to understand what merge means. What is merging about? At one level, merging is about combining work (i.e., combining changes). At another, it's about combining history.

As we've already seen, Git doesn't store changes in the first place, so when we talk about combining changes we have to define what we're going to mean by changes, and hence work. Our answer will be very similar to what it was for an individual, ordinary single-parent commit, where we found the parent commit and extracted both snapshots—the parent's, and this commit's—and compared them with git diff.

This time, though, we have:

          I--J   <-- br1
         /
...--G--H
         \
          K--L   <-- br2

and we'd like to somehow combine all the work done on br1 with all the work done on br2. As we saw earlier, some commits here are on both branches: commits up through and including H. So we don't need to worry about those commits: they're already "combined" in all possible ways! They're literally the same commits. But since commit H, we had new work happen on br1, in commits I and J, and since commit H, we had new work happen on br2, in commits K and L.

What we'll do, then, is have Git extract the files from commit H—the common starting point, or what Git calls the merge base commit—and set them in some area. Then we'll have Git extract the files from J, skipping right over I, and compare—i.e., git diff—the snapshots from H and J.

We could have Git compare the snapshot in H vs that in I, then compare the snapshot in I vs that in J. But if we think about it for a while, we'll see that adding those two diffs together will usually give us the same difference that we get by jumping straight from H to J. So that's what Git does here. The resulting diff is what changed on br1 (since the common starting point).

Then, we have Git compare the snapshot in H again, this time vs the snapshot in L. This is what changed on br2 (again, since the common starting point).

The merge process (what I like to call merge as a verb) now combines these two sets of changes. Git then applies the combined changes to the snapshot in H, the merge base. This "keeps the br1 changes and adds the br2 changes", or "keeps the br2 changes and adds the br1 changes": the result is the same either way, as long as there are no conflicts. Git defines a conflict as a case where one "side" makes a different change than the other "side" to an overlapping or abutting set of lines. For any changes to a single file that don't overlap or abut, Git adds those together. For a change that overlaps exactly, but where both sides make the same change (perhaps both fix the same typo in the same word, for instance), Git just takes one copy of the change. The "changes" Git sees here are those shown by git diff: add some line(s), remove some line(s).

If all goes well, having successfully applied the combined changes to the snapshot in H, Git is now ready to make a new commit—a new snapshot and metadata—and Git does in fact make this commit, but there's one thing about this commit that's special. While it has a snapshot (like any commit) and metadata (like any commit), its list of parent commits has not just one hash ID in it, but two.

The first parent of new merge commit M is the commit we selected by our git checkout or git switch. The second parent is the commit we named on the git merge command line. Say we ran:

git switch br1
git merge br2

Then new commit M goes "on" branch br1, because that's the branch we're "on", and it looks like this:

          I--J
         /    \₁
...--G--H      M   <-- br1 (HEAD)
         \    /₂
          K--L   <-- br2

That is, parent #1 of commit M is commit J. Commit M is the new tip commit of branch br1. Parent #2 for M is commit L: the commit we merged.

The first-vs-second parent stuff is potentially useful later, though we'll skip all the details here for space reasons. The fact that the merge commit has two parents affects how Git views history, but again, we'll skip all this for space reasons.
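A sketch of a true merge, to check the two-parent claim by hand. The commit function is a hypothetical helper defined only for this example, and each branch gets a single commit rather than two to keep it short.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

commit() {   # hypothetical helper: write one file and commit it
  echo "$2" > "$1"
  git add "$1"
  git commit -q -m "$3"
}

commit base.txt "base" "H"
git switch -q -c br1
commit one.txt "work on br1" "J"
git switch -q -c br2 main
commit two.txt "work on br2" "L"

j=$(git rev-parse br1)
l=$(git rev-parse br2)

git switch -q br1
git merge -q -m "M: merge br2" br2

first_parent=$(git rev-parse HEAD^1)    # the commit we were on: J
second_parent=$(git rev-parse HEAD^2)   # the commit we merged: L
git --no-pager log --graph --oneline --all
```

The suffixes ^1 and ^2 select the first and second parent of a commit, which is how the numbering in the drawing shows up on the command line.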

Squash merges are not merges

If we start with the same situation:

          I--J   <-- br1
         /
...--G--H
         \
          K--L   <-- br2

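and run git switch br1 && git merge --squash br2, Git will go through most of the same actions as for a real merge: Git will find the merge base H, run the two git diff commands, combine the changes, and prepare a new commit. (On the command line, git merge --squash stops just before committing, so you finish with git commit; a hosting site's "Squash and Merge" button does both steps.) But the new commit is not going to be a merge commit. Instead of having two parents, it will have only one: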

          I--J--S   <-- br1 (HEAD)
         /
...--G--H
         \
          K--L   <-- br2

The snapshot in our squash commit S will exactly match the snapshot we would have gotten with a true merge. But commit S has only one parent. In other words, there's no joining-up of history.
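To compare the two outcomes side by side, this sketch performs a true merge on one scratch branch and a squash on another, both starting from br1. The names with-merge and with-squash and the commit helper are invented for the example. Note that on the command line git merge --squash only stages the result, so a separate git commit makes S.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

commit() {   # hypothetical helper: write one file and commit it
  echo "$2" > "$1"
  git add "$1"
  git commit -q -m "$3"
}

commit base.txt "base" "H"
git switch -q -c br1
commit one.txt "work on br1" "J"
git switch -q -c br2 main
commit two.txt "work on br2" "L"

git switch -q -c with-merge br1
git merge -q -m "M: true merge of br2" br2

git switch -q -c with-squash br1
git merge -q --squash br2             # stages the combined result, stops before committing
git commit -q -m "S: squash of br2"

merge_tree=$(git rev-parse 'with-merge^{tree}')     # snapshot of M
squash_tree=$(git rev-parse 'with-squash^{tree}')   # snapshot of S
squash_parent=$(git rev-parse with-squash^1)
if git rev-parse -q --verify 'with-squash^2' >/dev/null; then
  squash_has_second_parent=yes
else
  squash_has_second_parent=no
fi
echo "same snapshot: $merge_tree / $squash_tree; second parent: $squash_has_second_parent"
```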

Having made a squash merge like this, there's only one thing to do with branch br2 in a normal Git work flow, and that is to delete the name. Once we delete the name br2 we have this:

...--G--H--I--J--S   <-- br1 (HEAD)
         \
          K--L   [abandoned]

Commits K-L still exist, but with no name, we (humans) can't find them. We probably don't remember the hash IDs, so we can't give Git the hash ID it needs to find commit L or commit K. There's no forwards link from H to K—only a backwards link from K to H—so there's no way to go from H to K. The name br1 points to S now, so we can find S and thus J and I and H and so on; that's the history we see with git log. It's as though commits K-L are just gone.¹

¹Eventually, in a typical Git repository, Git will realize that these abandoned commits should be removed, and will remove them for real. At that point, even having the hash ID won't help, as the commits really are gone. Some repositories, such as those on GitHub, never clean up like this, and having the hash ID means you can continue to get at these old commits forever—but you still need to find the hash ID, somehow.

Your situation

In your setup, you have done this:

...--G--H   <-- master, origin/master
         \
          I--J   <-- feature/one (HEAD)

(The name origin/master is your Git's way of remembering some other Git repository's master: a remote-tracking name. In this case the other Git repository is the central one on Bitbucket.) You send commits I-J to the Bitbucket repository, having them create the name feature/one in the process. Then you switch back to your existing commit H to make feature/two:

...--G--H   <-- feature/two (HEAD), master, origin/master
         \
          I--J   <-- feature/one, origin/feature/one

The snapshots in your commits I-J are in your repository, but not in your working tree, because you're working with commit H again.

What you currently do is avoid making feature/two yet:

...--G--H   <-- master (HEAD), origin/master
         \
          I--J   <-- feature/one, origin/feature/one

You now wait for someone to use Git's squash-"merge" feature to make a new commit (we'll call it S1) to add on to H:

          S1   <-- origin/master
         /
...--G--H   <-- master (HEAD)
         \
          I--J   <-- feature/one, origin/feature/one

You can now move your master forward to point to S1:

          S1   <-- master (HEAD), origin/master
         /
...--G--H
         \
          I--J   <-- feature/one, origin/feature/one

and then create your feature/two; this works well even if they (whoever "they" are) first, or immediately after, squash-merge some other feature(s):

          S3--S1--S4   <-- origin/master
         /
...--G--H   <-- master (HEAD)
         \
          I--J   <-- feature/one, origin/feature/one

where S3 and S4 are from feature/three and feature/four (which you may never pick up if they're added to, but then deleted from, origin before you have a chance to see them and create origin/feature/three etc in your own repository). This all leaves you safe to start your feature/two from commit S4.

In fact you don't need the name master in your repository at all, so let's just delete it, after first switching to feature/one (Git won't delete the branch that is currently checked out):

          S3--S1--S4   <-- origin/master
         /
...--G--H
         \
          I--J   <-- feature/one (HEAD), origin/feature/one

We then create and switch to feature/two from S4 with:

git switch -c feature/two --no-track origin/master

          S3--S1--S4   <-- feature/two (HEAD), origin/master
         /
...--G--H
         \
          I--J   <-- feature/one, origin/feature/one

You're now free to delete feature/one, and whoever controls the Bitbucket repository can delete feature/one over there, so that git fetch --prune will delete origin/feature/one:

...--G--H--S3--S1--S4   <-- feature/two (HEAD), origin/master
         \
          I--J   [abandoned]

and now you can make commits K and L to create feature/two starting from commit S4:

                      K--L   <-- feature/two (HEAD)
                     /
...--G--H--S3--S1--S4   <-- origin/master

What you'd like to do

What you might like to do, though, is start working on feature/two knowing that feature/one will eventually be squash-merged into origin/master at some point.

You have a choice of how to achieve this. Let's assume for now that you choose not to make your own squash-merge: that, instead, you just start feature/two from commit J, like this:

...--G--H   <-- origin/master
         \
          I--J   <-- feature/one, feature/two (HEAD)

You now make commits K and L:

...--G--H   <-- origin/master
         \
          I--J   <-- feature/one
              \
               K--L   <-- feature/two (HEAD)

Now, here's the important part: you either don't give commits K-L to anyone else at all (so that nobody else has them), or you tell anyone who does get commits K-L that these are temporary commits and they must not assume that these two are the final versions. This way nobody else depends on the existence of commits K-L.

Now suppose that commit J is squashed into origin/master as S1 immediately after commit H:

...--G--H--S1   <-- origin/master
         \
          I--J   <-- feature/one
              \
               K--L   <-- feature/two (HEAD)

The snapshot in commit S1 will exactly match the snapshot in commit J. (Think about how Git performs merges: Git has to combine work since commit H, up to commit H for the one side, and up to commit J for the other. Well, what "work" was done from commit H to commit H? What happens when we add the work done from commit H to commit J, to the nothing-at-all done from H to H? What is x plus zero, or zero plus x?)
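The "x plus zero" argument can be checked in a scratch repository. In this sketch a local master plays the role of origin/master, and the commit helper and file contents are made up; the squash is done locally with git merge --squash followed by git commit, standing in for the hosting site's button.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

commit() {   # hypothetical helper: write one file and commit it
  echo "$2" > "$1"
  git add "$1"
  git commit -q -m "$3"
}

commit base.txt "base" "H"
git switch -q -c feature/one
commit one.txt "draft" "I"
commit one.txt "final" "J"

git switch -q master                 # master is still at H
git merge -q --squash feature/one
git commit -q -m "S1: squash of feature/one"

s1_tree=$(git rev-parse 'master^{tree}')        # snapshot in S1
j_tree=$(git rev-parse 'feature/one^{tree}')    # snapshot in J
echo "S1 snapshot: $s1_tree"
echo "J  snapshot: $j_tree"
```

S1 and J are different commits with different hash IDs and different parents, but they hold the same snapshot.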

So if this is what happens, all we have to do is somehow change commit K to extend out from commit S1. We literally can't do that—no part of any commit can change, including its metadata—but we can make an almost exact copy of K, which we can call K', where the only difference from K is that its parent is S1 instead of J:

              K'  <-- new (HEAD)
             /
...--G--H--S1   <-- origin/master
         \
          I--J   <-- feature/one
              \
               K--L   <-- feature/two

Having done that, we can make an almost-exact-copy of L whose parent is K':

              K'-L'  <-- new (HEAD)
             /
...--G--H--S1   <-- origin/master
         \
          I--J   <-- feature/one
              \
               K--L   <-- feature/two

The new branch, called new here, contains exactly those commits we'd like someone to squash-merge. So all we have to do now is make the name feature/two point to commit L':

              K'-L'  <-- feature/two, new (HEAD)
             /
...--G--H--S1   <-- origin/master
         \
          I--J   <-- feature/one
              \
               K--L   [abandoned]

Since no one else has commits K-L, nobody will ever know that we switched from K-L to K'-L'.

You can now also delete your feature/one name: you were using it, until now, to remember which commits were only on feature/two, and which were on both feature/two and feature/one. And you can switch to feature/two again and discard the name new:

              K'-L'  <-- feature/two (HEAD)
             /
...--G--H--S1   <-- origin/master

But suppose S1 gets S3 and/or S4 tacked on too. Then what you want looks like this:

                      K'-L'  <-- feature/two (HEAD)
                     /
...--G--H--S3--S1--S4   <-- origin/master

with one more difference besides the parent of K' being S4: you need the snapshot in K' to be the effect of taking the same changes that K makes relative to its parent J, and applying them to S4 instead.

Achieving what you want

The Git command that takes commit K and copies it to new-and-improved commit K' is git cherry-pick. You can think about cherry-pick as meaning find the difference from the parent commit (in this case J vs K), then apply that same change to some other commit (in this case S4). Internally, Git actually does the cherry-pick using the merge engine, but making an ordinary non-merge commit in the end. This means cherry-picking can produce merge conflicts. If so, you resolve them the same way you resolve any merge conflicts: do not fear merge conflicts, just learn to resolve them.
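Here is a sketch of copying a single commit with git cherry-pick, in the simple case where S1 sits directly on H. A local master stands in for origin/master, feature/one has just one commit, and the commit helper and the branch name new are illustrative only.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

commit() {   # hypothetical helper: write one file and commit it
  echo "$2" > "$1"
  git add "$1"
  git commit -q -m "$3"
}

commit base.txt "base" "H"
git switch -q -c feature/one
commit one.txt "feature one" "J"
git switch -q -c feature/two         # starts from J
commit two.txt "feature two" "K"

git switch -q master
git merge -q --squash feature/one
git commit -q -m "S1: squash of feature/one"

git switch -q -c new master
git cherry-pick feature/two > /dev/null   # copy K (the tip of feature/two) onto S1

git --no-pager log --graph --oneline --all
```

The copy K' has S1 as its parent and a new hash ID. Because S1's snapshot matches J's here, K' ends up with the same snapshot as K.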

Now, the issue with using git cherry-pick here is that it only does one commit. As we can see in the previous section, we may have multiple commits to copy, and we also need to tell Git to move a branch name at the end. The command that does all of this at once is git rebase.

The rebase command is pretty complicated and has a lot of options, but for your purposes here, the command you want is a simple git rebase --onto. To perform git rebase --onto, you should:

1. Check out / switch to the branch, if necessary: git switch feature/two, if you're not already there.
2. Figure out which commits need copying and which don't. The ones that shouldn't be copied are the ones from the earlier feature branch, which are now squashed in: those are the ones found by the name of your other feature branch, in this case feature/one.
3. Tell Git where the copies go.

You achieve part 3 with the --onto option to git rebase, and you achieve part 2 with the remaining git rebase argument, so your command here is:

git rebase --onto origin/master feature/one

That is, take the current branch's commits—all of them except any that are reachable from feature/one—and copy them to come after the commit found by the name origin/master. Since that finds S4, we'll go from this:

...--G--H--S3--S1--S4   <-- origin/master
         \
          I--J   <-- feature/one
              \
               K--L   <-- feature/two (HEAD)

to this:

                      K'-L'  <-- feature/two (HEAD)
                     /
...--G--H--S3--S1--S4   <-- origin/master
         \
          I--J   <-- feature/one

all with the one git rebase --onto command. If it has some conflicts during the copying process, resolve them and run git rebase --continue; repeat for each commit that gets conflicts. If there are no conflicts, you're done (aside from testing of course).
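The whole sequence can be rehearsed in a throwaway repository before trying it on real work. This is a sketch under some assumptions: a local branch named master stands in for origin/master, the squash is done locally with git merge --squash plus git commit, and the commit helper, file names, and messages are invented.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

commit() {   # hypothetical helper: write one file and commit it
  echo "$2" > "$1"
  git add "$1"
  git commit -q -m "$3"
}

commit base.txt "base" "H"
git switch -q -c feature/one
commit one.txt "draft" "I"
commit one.txt "final" "J"
git switch -q -c feature/two         # starts from J, before feature/one is reviewed
commit two.txt "draft" "K"
commit two.txt "final" "L"

# Meanwhile "upstream" moves on: S3, then the squash of feature/one, then S4.
git switch -q master
commit three.txt "other feature" "S3"
git merge -q --squash feature/one
git commit -q -m "S1: squash of feature/one"
commit four.txt "another feature" "S4"

# Copy only the commits that feature/one cannot reach (K and L) onto master's tip.
git switch -q feature/two
git rebase -q --onto master feature/one

copied=$(git rev-list --count master..feature/two)
new_base=$(git rev-parse feature/two~2)
git --no-pager log --graph --oneline master feature/one feature/two
```

After the rebase, feature/two consists of two copied commits sitting on top of S4, and its working tree contains the files from all four features.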