
This might be a beginner question, but it has been bothering me. Any help will be greatly appreciated.

Scenario:

1] Clone a project that has some files, say file1, file2, and file3

2] Create a new branch, say branch1

3] Delete file1 in my branch, make some other changes, say to file2, and add a new file, file4

4] Open a pull request

Question: Will file1 get deleted from main after the PR is approved by the reviewer?

user9552213

1 Answer


Files don't exist in branches in the first place,[1] so the question is ill-formed. You can, however, test this yourself, as phd suggests in a comment:

  • make two new branch names locally, test1 and test2 for instance;
  • in one of the new branches (test1), delete some file(s) and commit;
  • check out the other branch and run git merge --no-ff test1 and observe the result.

Here is the command sequence for testing purposes; this assumes the test file to be removed is test.ext. Change that as needed.

git switch main
git branch test1 && git branch test2
git switch test1
git rm test.ext && git commit -m 'remove file for testing'
git switch test2
git merge --no-ff test1

The file will be gone from your working tree. Is the file "in" branch test2? What do we mean by "branch test2"?

If we avoid the problematic word branch, we can talk instead about commits. Git is really all about commits. Files are stored in commits (not in branches, which are relatively unimportant; only commits are truly important to Git). Branch names like test1 and test2 merely serve to find particular commits. Git finds commits by raw hash ID: those big ugly strings like 6cc022f7696a45f5b3814d7659a7b0f16436b4bf. Humans are no good at these: they look like random junk. (They aren't random at all, but they are deliberately random-looking and unpredictable. They are poisonous to living creatures; only computers can digest them and remain healthy.[2])

Since commits also serve to provide hash IDs to find other (earlier) commits—this is how history works, in a Git repository, and hence part of what "branch" sometimes means—it suffices for Git to be able to find the newest commit "on" some branch, in one swell foop, from which it can find all earlier commits that are also "on" that branch (using that particular definition of "branch"). A branch name like test1 here gives Git the ability to get the latest commit: that's literally how the name is defined (and that in turn is part of what "branch" sometimes means).
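To see that a branch name is just a pointer to the newest commit, here is a minimal sketch in a throwaway repository. The repository location, file names, and commit messages are made up for illustration; only the Git commands themselves are real.

```shell
# Hypothetical scratch repository; every name below is a placeholder.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" on any Git version
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

echo one > file1
git add file1
git commit -q -m "first"
echo two > file2
git add file2
git commit -q -m "second"

tip=$(git rev-parse main)            # the name resolves to the newest commit's hash ID
parent=$(git rev-parse main~1)       # that commit records the hash ID of its parent
count=$(git rev-list --count main)   # following parent links from the tip reaches every commit

echo "tip:    $tip"
echo "parent: $parent"
echo "commits reachable from main: $count"

cd / && rm -rf "$repo"
```

The hash IDs will differ on every run (they depend on timestamps, among other things), but the count of reachable commits should be 2.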

What we find with the above sequence of commands is that:

  • File test.ext does exist in the commit that git switch main extracts.
  • File test.ext does not exist in the commit that git switch test1 extracts.
  • File test.ext also does not exist in the commit that git switch test2 extracts.
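If you would rather check those three claims without switching branches, something like the following should work. It is a sketch that rebuilds the same scenario in a scratch repository (the temporary directory and the file content are assumptions for the demo) and then asks Git, per branch name, whether the tip commit contains test.ext.

```shell
# Hypothetical scratch repository that replays the test1/test2 scenario.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" on any Git version
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

echo "hello" > test.ext
git add test.ext
git commit -q -m "add test.ext"

git branch test1
git branch test2
git switch -q test1
git rm -q test.ext
git commit -q -m "remove file for testing"
git switch -q test2
git merge -q --no-ff --no-edit test1

# "git cat-file -e" exits 0 only when the named object exists,
# and "<name>:<path>" means "that path in the commit the name points to".
report=$(for name in main test1 test2; do
    if git cat-file -e "$name:test.ext" 2>/dev/null; then
        echo "$name: test.ext present"
    else
        echo "$name: test.ext absent"
    fi
done)
echo "$report"

cd / && rm -rf "$repo"
```

This reports the file as present for main and absent for both test1 and test2.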

Since GitHub will generally do a PR-merge[3] using the equivalent of git merge --no-ff, that's what we did to make the commit that is now the tip of branch test2. So you can see that removing the file will result in a removed file.

The real key concepts here, though, are these:

  • The files are stored in commits.
  • The commits are numbered (by hash IDs).
  • The stored files inside any given commit are the way they were, at the time you (or whoever) made the commit, forever.

To use a commit, you "check it out" with git checkout or git switch or git switch --detach or similar. This extracts the files from that commit. So a file, once committed, is there forever. It's just not necessarily in any other commit. If it's in five commits, it's in those five commits forever. If it's removed as of the sixth commit, it's not in the sixth commit—forever! And future commits start with previous commits, so the set of files in future commits will lack that file too, unless and until you put it back and commit that, and then it will be in that commit (forever) and in future commits unless and until someone changes it or removes it.
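Here is a small sketch of that "forever" behavior, again in a scratch repository with made-up names and content: the file is committed, removed in the next commit, read back out of the older commit, and finally put back in a third commit.

```shell
# Hypothetical scratch repository; names and content are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

echo "keep me" > test.ext
git add test.ext
git commit -q -m "add test.ext"

git rm -q test.ext
git commit -q -m "remove test.ext"

# The older commit is unchanged, so the file can still be read from it.
old_content=$(git show HEAD~1:test.ext)
echo "content in the older commit: $old_content"

# Put the file back from the older commit and record that as a new commit.
git checkout -q HEAD~1 -- test.ext
git commit -q -m "restore test.ext"

restored_content=$(git show HEAD:test.ext)
if git cat-file -e HEAD~1:test.ext 2>/dev/null; then
    middle_commit=present
else
    middle_commit=absent
fi
echo "content in the newest commit: $restored_content"
echo "test.ext in the middle (removal) commit: $middle_commit"

cd / && rm -rf "$repo"
```

The middle commit still lacks the file after the restore, because restoring makes a new commit instead of changing an old one.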

Every commit saves every file for all time. To keep repositories from growing bloated and becoming unusable, the files stored in a commit, in a Git repository, are (a) compressed and (b) de-duplicated. If the content of some file test.ext in commit a123456 exactly matches the content of some other file other.blargh in commit b987654, there's really only one copy of that content (stored under two different names in this case). Git can do this because every stored copy is permanently read-only: there's no danger of the saved test.ext ever changing. If you make a new commit c0ffee1 in which test.ext has different content, Git will save that content as a new and different file (using the same old name though), and later, git diff will show you what's different between the a123456 file named test.ext and the c0ffee1 file named test.ext.
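The de-duplication is easy to observe, because Git names each stored file content (a "blob") by a hash of that content. The sketch below uses the file names from the paragraph above in a scratch repository; the file content is invented for the demo.

```shell
# Hypothetical scratch repository; file content is a placeholder.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

printf 'same content\n' > test.ext
cp test.ext other.blargh
git add test.ext other.blargh
git commit -q -m "two names, one content"

# "<commit>:<path>" resolves to the hash ID of the stored blob.
blob_test=$(git rev-parse HEAD:test.ext)
blob_other=$(git rev-parse HEAD:other.blargh)

printf 'different content\n' > test.ext
git commit -q -am "change test.ext"
blob_new=$(git rev-parse HEAD:test.ext)
blob_old=$(git rev-parse HEAD~1:test.ext)

echo "test.ext     (first commit):  $blob_test"
echo "other.blargh (first commit):  $blob_other"
echo "test.ext     (second commit): $blob_new"

# The old content is still there under the old commit; diff compares the two.
git diff HEAD~1 HEAD -- test.ext

cd / && rm -rf "$repo"
```

The first two hash IDs match (one stored blob, two names), while the second commit's test.ext gets a new one and the older commit keeps pointing at the original.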

So, even the simpler question "does removing a file remove it from the branch" is poorly defined, because "the branch" is poorly defined. Removing a file removes it from some set of commits, starting with the next one you make. Each commit contains whatever files it contains, with whatever contents they have, and as long as that commit exists, those files with that content also exist.

Running git merge combines work, and if that work includes "remove a file", git merge will try to combine that work with whatever work someone else did with that file. (If the other side left the file untouched, the removal simply carries through. If the other side did touch it, the only work Git can combine with "remove a file" on its own is another "remove a file". So if one person did a "remove file" operation, and the other person did "change file content" on that same-named file, Git will declare a conflict here, and make you, the human who understands the software, figure out what to do.)
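Both outcomes can be reproduced locally. In this sketch the branch names remover, editor, and bystander are hypothetical: remover deletes test.ext, editor changes its content, and bystander only touches an unrelated file.

```shell
# Hypothetical scratch repository; branch and file names are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

printf 'original\n' > test.ext
printf 'unrelated\n' > other.txt
git add test.ext other.txt
git commit -q -m "initial"

git branch remover
git branch editor
git branch bystander

git switch -q remover
git rm -q test.ext
git commit -q -m "remove test.ext"

git switch -q editor
printf 'edited\n' > test.ext
git commit -q -am "edit test.ext"

git switch -q bystander
printf 'changed\n' > other.txt
git commit -q -am "edit other.txt only"

# Case 1: "remove" meets "change content": Git stops with a conflict.
git switch -q editor
if git merge --no-ff --no-edit remover >/dev/null 2>&1; then
    editor_merge=clean
else
    editor_merge=conflict
fi
echo "editor + remover: $editor_merge"
git merge --abort

# Case 2: "remove" meets "did not touch the file": the removal carries through.
git switch -q bystander
if git merge --no-ff --no-edit remover >/dev/null 2>&1; then
    bystander_merge=clean
else
    bystander_merge=conflict
fi
if git cat-file -e HEAD:test.ext 2>/dev/null; then
    bystander_file=present
else
    bystander_file=absent
fi
echo "bystander + remover: $bystander_merge, test.ext $bystander_file"

cd / && rm -rf "$repo"
```

The first merge stops with a modify/delete conflict; the second completes on its own and leaves test.ext absent from the merge commit.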


[1] The real problem here, if we dig deeper, is the word branch, which is poorly defined. See also What exactly do we mean by "branch"? Nonetheless, using most of the common definitions of branch, we don't get a sensible answer for whether some file is "in" any given branch, because it very often both is and isn't, at the same time.

[2] I'm only somewhat kidding here: trying to memorize Git hash IDs is probably harmful to your mental health. There is one for the empty tree that git rev-parse really should be able to spit out on command, but can't, so that you have to use git hash-object instead, to avoid memorizing it. Soon, there will be two, because of SHA-256.
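For reference, this is the usual git hash-object incantation for the empty tree. It assumes a system with /dev/null and Git's default SHA-1 object format; in a SHA-256 repository the result would be a different, longer ID.

```shell
# Hash an empty input as a "tree" object without writing anything to a repository.
empty_tree=$(git hash-object -t tree /dev/null)
echo "$empty_tree"
# With SHA-1 this prints 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```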

[3] While Pull Requests are a GitHub-specific item (e.g., GitLab has "Merge Requests" instead), they share a lot with the core Git here. The big green MERGE button on GitHub does a Git-style merge. You can switch this to SQUASH AND MERGE or REBASE AND MERGE, though, and those are different, so don't just assume it's always the same.
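As a rough local approximation of two of those buttons (an assumption on my part: GitHub's server-side behavior is only similar to these commands, not necessarily identical), the sketch below merges the same hypothetical feature branch once with --no-ff and once with --squash. The resulting commits differ in how many parents they record, but the file removal shows up in both.

```shell
# Hypothetical scratch repository; branch and file names are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

printf 'hello\n' > test.ext
printf 'keep\n' > keep.txt
git add test.ext keep.txt
git commit -q -m "initial"

git branch feature
git branch via-merge
git branch via-squash

git switch -q feature
git rm -q test.ext
git commit -q -m "remove test.ext"

# Ordinary merge: the new commit records two parents.
git switch -q via-merge
git merge -q --no-ff --no-edit feature
merge_parents=$(git cat-file -p HEAD | grep -c '^parent ')
if git cat-file -e HEAD:test.ext 2>/dev/null; then merge_file=present; else merge_file=absent; fi

# Squash: the combined change is staged, then committed with a single parent.
git switch -q via-squash
git merge --squash feature >/dev/null 2>&1
git commit -q -m "squash feature"
squash_parents=$(git cat-file -p HEAD | grep -c '^parent ')
if git cat-file -e HEAD:test.ext 2>/dev/null; then squash_file=present; else squash_file=absent; fi

echo "merge:  $merge_parents parents, test.ext $merge_file"
echo "squash: $squash_parents parent,  test.ext $squash_file"

cd / && rm -rf "$repo"
```

The merge commit has two parents and the squash commit has one, and test.ext is absent from both.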

torek
  • Thanks for such an elaborate answer. Even though I did not understand it completely, it definitely gave me some direction to look further and deepen my knowledge of the concept. – user9552213 Feb 17 '22 at 07:03