I ran into an interesting thing today, and it has me wondering how git handles local branches. I was having some issues with my repository, so I deleted the local folder (which went to the recycle bin) and recloned (extreme, maybe). Afterwards I realized I had deleted a local branch that I never pushed because it was a personal side project. I panicked for a moment, then decided to restore the folder from the recycle bin to a different location and see if I could get the few files I had worked on back and push them to the remote. At first I just searched with the file explorer and couldn't find the files (I was sad). Then I remembered I had been on a different branch before I deleted the folder, so I pointed Git Bash at the restored copy of the old repo and ran 'git branch'. Behold, all my old branches appeared (git magic #1). So I checked out the branch in question, and the file explorer happened to be open to the location where I expected the file to be, and behold, the file magically appeared (git magic #2).

That leads me to wonder: where does git store all the data for local branches? I know there is a hidden folder, but I searched it for the file and it didn't show up. Is it compressed and renamed in there?

ProgramKitkat

1 Answer

There's a hidden folder .git in the project root directory (it gets created when you run git init or otherwise initialize a repository, e.g. by cloning), in which git stores all the "magical bits". It's not extremely magical. There's a HEAD text file which says where the "head" is (the currently checked-out branch or commit), and there's a refs directory whose subdirectories hold files named after your local and remote branches. Each of these files is just a text file containing a commit SHA. It works like a dictionary: when you say "check out master branch!", git looks up the corresponding file, reads which commit it names, and checks out that commit.
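
To make the HEAD and refs lookup concrete, here is a minimal Python sketch of roughly what that "dictionary" lookup amounts to. The function name resolve_head is made up for illustration and is not part of git. It assumes branch files are plain text as described above, with a fallback to the packed-refs file, where git may consolidate refs instead of keeping one file per branch.

```python
from pathlib import Path


def resolve_head(git_dir):
    """Return the commit SHA that HEAD points to, or None if it cannot be found.

    Hypothetical helper for illustration only; real git handles more cases.
    """
    git_dir = Path(git_dir)
    head = (git_dir / "HEAD").read_text().strip()

    # Detached HEAD: the file holds a commit SHA directly.
    if not head.startswith("ref: "):
        return head

    # Normal case: HEAD names a branch, e.g. "ref: refs/heads/master".
    ref_name = head[len("ref: "):]
    ref_file = git_dir / ref_name
    if ref_file.is_file():
        return ref_file.read_text().strip()

    # Fallback: the ref may have been moved into the packed-refs file.
    packed = git_dir / "packed-refs"
    if packed.is_file():
        for line in packed.read_text().splitlines():
            if line.startswith("#") or line.startswith("^"):
                continue  # header comment or peeled-tag line
            sha, _, name = line.partition(" ")
            if name == ref_name:
                return sha
    return None
```

Calling resolve_head(".git") from a project root should give the same SHA as git rev-parse HEAD in ordinary cases. This is read-only on purpose: let git itself do any writing inside .git.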

Commits refer to "objects", which live, conveniently, in the objects/ directory. That directory contains a bunch more directories with two-character names, which are the first two characters of an object's SHA-1 hash; together with the 38-character file name inside, they make up the full 40-character hash (of a commit, tree or blob). The files inside those two-character directories are the actual "objects" (git magic!). An object can be a commit, a "tree" or a "blob"; blobs correspond to file contents and trees (loosely) to directories. Read about git objects in the docs - it's an easy read that also gives you some tools to look at individual objects.
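
This is also why searching .git for a file by name finds nothing: the file is stored under its hash, and its contents are zlib-compressed. A small Python sketch of how a blob's ID and loose-object path are derived follows. The function name blob_object is invented for illustration, and it assumes a SHA-1 repository with the object stored loose (git can also bundle objects into packfiles under objects/pack).

```python
import hashlib
import zlib


def blob_object(content: bytes):
    """Return (sha, path, compressed_bytes) for file contents stored as a blob.

    Illustrative helper, not a git API.
    """
    # Git hashes a small header plus the raw contents: b"blob <size>\0<contents>".
    store = b"blob " + str(len(content)).encode() + b"\0" + content
    sha = hashlib.sha1(store).hexdigest()

    # First two hex characters name the directory, the other 38 name the file.
    path = f".git/objects/{sha[:2]}/{sha[2:]}"

    # What lands on disk is the zlib-compressed header + contents.
    return sha, path, zlib.compress(store)


sha, path, compressed = blob_object(b"hello world\n")
print(sha)   # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(path)  # .git/objects/3b/18e512dba79e4c8300dd08aeb37f8e728b8dad
```

The printed ID matches what git hash-object reports for a file containing "hello world" and a newline. Note that the original file name appears nowhere in the blob; names live in tree objects.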

So, if .git/refs/heads/master is a text file with contents a2789da8f918ef26c90e51d05de5723e5ad543a4, that means master is at that commit. The commit itself is stored in .git/objects/a2/789da8f918ef26c90e51d05de5723e5ad543a4. It is a commit object that points to the tree object for the project directory, and that tree lists the SHAs of all the files & directories in the project dir at that moment.

So, you can't really browse around and find actual files from different branches. Git doesn't think about things that way; it thinks in terms of snapshots, i.e. lists of files & dirs identified by SHA. When you change a file and commit it, git creates a new "object" for the contents of that file, new tree objects for its parent directories (with the updated SHA for your updated file), and a commit object that references the root tree, the parent commit, author/committer info and the commit message. Or at least that's my understanding, more or less.
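
If you want to peek inside a commit object yourself, here is a rough Python sketch that reads a loose object and splits a commit into its fields. Both function names are made up for illustration, and it makes simplifying assumptions: the object is stored loose (not in a packfile) and multi-line headers such as signatures are ignored. In practice git cat-file -p does this job properly.

```python
import zlib
from pathlib import Path


def read_loose_object(git_dir, sha):
    """Return (type, body) for a loose object under <git_dir>/objects/.

    Illustrative helper; does not handle packfiles.
    """
    path = Path(git_dir) / "objects" / sha[:2] / sha[2:]
    raw = zlib.decompress(path.read_bytes())
    # Layout is b"<type> <size>\0<body>".
    header, _, body = raw.partition(b"\0")
    obj_type, _, size = header.partition(b" ")
    if int(size) != len(body):
        raise ValueError("object size does not match its header")
    return obj_type.decode(), body


def parse_commit(body):
    """Split a commit body into (fields, message).

    fields maps each header key (tree, parent, author, committer) to a list
    of values, since a merge commit has more than one parent.
    """
    head, _, message = body.decode().partition("\n\n")
    fields = {}
    for line in head.splitlines():
        if line.startswith(" "):
            continue  # continuation of a multi-line header; skipped here
        key, _, value = line.partition(" ")
        fields.setdefault(key, []).append(value)
    return fields, message
```

Feeding the SHA from a branch file into read_loose_object should return type "commit", and the "tree" field from parse_commit is the ID of the root tree for that snapshot.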

Hope that helps demystify it a bit!


2017-12-21 edit: updated my old answer per comment from @Herman. For what it's worth, this answer here might be a bit "too much information without enough context".

The shorter answer - git stores all the data in the .git directory in the project root, and it stores references to the state of your project folder rather than separate copies of files for each branch.

For sake of posterity, if anyone stumbles upon this answer in the future - I highly recommend this course:

https://app.pluralsight.com/library/courses/how-git-works/table-of-contents (you can find offers for free trial for pluralsight and get a lot of value out of your trial).

apprenticeDev
  • Commits do not contain diffs. You can "pretty print" the contents of a commit: try, e.g., `git cat-file -p HEAD` to see what's in the `HEAD` commit. The `tree` line gives the ID of a tree object, which is also in the `objects/` directory, and you can `git cat-file -p` the tree object to see what's in that, and so on. All objects are stored by their SHA-1 ID. – torek Nov 05 '15 at 22:03
  • Thanks @torek, updated the answer to reflect what I learned from the docs. – apprenticeDev Nov 05 '15 at 22:39
  • Some mistakes still present: the two-character names do not refer to a commit and the files inside to blobs and trees. The two-character names together with the 38-character names of the files inside form 40-character SHA-1 IDs of objects, which can be commits, blobs or trees. If it's a commit, it will contain a reference to the root tree SHA-1 ID, a reference to the parent commit's SHA-1 ID, author/committer info and commit message. – herman Dec 20 '17 at 14:09
  • Thanks @herman, updated the answer. 2 years ago I was excited to share my grasp of it - but now I've grown more bitter and kind of feel this answer might be a bit TMI. Added a paragraph to the bottom of the answer for sake of future generations. – apprenticeDev Dec 21 '17 at 19:42
  • When editing files in a new branch, we are supposed to create files inside of the hidden .git directory? Really? – David Spector Jan 27 '19 at 14:25
  • @DavidSpector no. Don't modify anything in the `.git` directory manually. The original question was "where's the stuff?" and in my answer I elaborated on this; however, you should let git manage all the things in its own directory - just interact with it with git commands :-) – apprenticeDev Jan 28 '19 at 23:15
  • My confusion had to do with the fact that in GitHub one can view the files inside branches. The view looks like a directory. I have discovered that it is not a real directory, but it does look like one. – David Spector Jan 29 '19 at 15:26