57

It seems like I have to learn to use git, which is probably a good thing (TM). However, reading online guides and man pages, I just cannot get my head around the terminology. Everything is always defined in terms of itself or other unexplained terms (do a "man git" and you'll see what I mean).

So, is there a more DAG-like structure of definitions of terms, including some of the following (all taken from the git man page(s)!)? Ideally it would use a file system as a starting point and not assume the reader is well versed in svn (which I am not).

  • repo
  • repository
  • a git
  • "the git"
  • index
  • clone
  • commit
  • branch
  • tree
  • upstream
  • a head
  • HEAD
  • version
  • tag
  • archive
  • patch
  • submission
  • changeset
  • stash
  • object
  • module
  • submodule
  • refspec
  • a history

While I can find explanations for some, they are usually given in terms of the others. Some other terms I do know from other contexts (like a UNIX diff). However, some others I only thought I knew...

I have gathered that there are repositories (similar to gits? and/or trees? upstream?), which you copy (clone? branch?) to get the files physically onto your hard drive. Then there are branches (similar to changesets?), tags and commits (similar to patches?), but the distinction between them is not clear. Which commands modify which files? What makes my files stay local and what might (heaven forbid) submit my code to teh internets?

What is the recommended way to work when it comes to branches, tags and commits, so that it is easy to swap between versions and to import updates from publicly available gits?

//T, biting his tongue to control his frustration...

The Apa

6 Answers

92

Here's an attempt to complete your glossary (off the top of my head, trying to use my own words):

  • repo, repository: This is your object database where your history and configuration are stored. May contain several branches. Often it contains a worktree too.

  • a git, "the git": never heard of, sorry. "the git" probably describes the software itself, but I'm not sure

  • index, staging area: This is a 'cache' between your worktree and your repository. You can add changes to the index and build your next commit step by step. When the index content is to your liking you can create a commit from it. It is also used to keep information during conflicted merges (the common ancestor, your side and their side).

  • clone: A copy of a repository ("just another repository") or the act of creating one ("to clone a repository" creates a new clone).

  • commit: A state of your project at a certain time. Contains a pointer to its parent commit (in case of a merge: multiple parents) and a pointer to the directory structure at this point in time.

  • branch: A different line of development. A branch in git is just a "label" which points to a commit. You can get the full history through the parent pointers. A branch by default is only local to your repository.

  • tree: Basically speaking a directory. It's just a list of files (blobs) and subdirectories (trees). (The list may also contain commits in case you use submodules, but that's an advanced topic)

  • upstream: After cloning a repository you often call that "original" repository "upstream". By default, git gives the remote you cloned from the name origin.

  • a head: The top commit of a branch (commit the label points to)

  • HEAD: A symbolic name for the currently checked out commit. Usually it points to the tip (the topmost commit) of the branch you are on.

  • version: Might be the same as a commit. Could also mean a released version of your project.

  • tag: A descriptive name given to one of your commits (or trees, or blobs). Can also contain a message (eg. changelog). Tags can be cryptographically signed with GPG.

  • archive: A simple archive (.tar, .zip), nothing special wrt git.

  • patch: A commit exported to text format. Can be sent by email and applied by other users. Contains the original author, commit message and file differences.

  • submission: no idea. Submitting a patch to a project maybe?

  • changeset: Synonym for "commit"

  • stash: Git allows you to "stash away" changes. This gives you a clean working tree without any changes. Later they can be "popped" to be brought back. This can be a life saver if you need to temporarily work on an unrelated change (eg. time critical bug fix)

  • object: Can be one of commit, tree, blob, tag. Each object is identified by its SHA1 hash, by which it is referenced (the commit with id deadbeaf, the tree decaf). The hash is identical between all repositories that share the same object. It also guarantees the integrity of a repository: you cannot change past commits without changing the hashes of all child commits.

  • (module,) submodule: A repository included in another repository (eg. external library). Advanced stuff.

  • revspec: A revspec (or revparse expression) describes a certain git object or a set of commits through what is called the extended SHA1 syntax (eg. HEAD, master~4^2, origin/master..HEAD, deadbeaf^!, …)

  • refspec: A refspec is a pattern describing the mapping between remote and local references during fetch or push operations.

  • history: Describes all ancestor commits prior to a commit going back to the first commit.
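
To make the "object" and "history" entries above a bit more concrete, here is a minimal Python sketch of how ids arise in a SHA-1 based repository. The blob hashing follows the usual "header plus content" scheme; `toy_commit_id` and `fake_tree` are names made up for this illustration, and the commit body is deliberately simplified (real commits also record author, committer and dates), so only the blob id corresponds to something git itself would report.

```python
import hashlib


def git_object_id(kind, body):
    # git hashes a small header ("<kind> <size>" plus a NUL byte) followed by the content.
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def toy_commit_id(tree_id, parent_ids, message):
    # Simplified commit body: real commits also record author, committer and dates.
    lines = [f"tree {tree_id}"]
    lines += [f"parent {p}" for p in parent_ids]
    lines += ["", message]
    return git_object_id("commit", "\n".join(lines).encode())


blob_id = git_object_id("blob", b"hello world\n")

# Hypothetical stand-in for a real tree id, just to have something to point at.
fake_tree = "0" * 40

first = toy_commit_id(fake_tree, [], "first commit")
second = toy_commit_id(fake_tree, [first], "second commit")

# "Rewrite history": change the first commit's message and rebuild its child.
first_rewritten = toy_commit_id(fake_tree, [], "first commit (edited)")
second_rewritten = toy_commit_id(fake_tree, [first_rewritten], "second commit")

print(blob_id)
print(second != second_rewritten)  # the child's id changes as well
```

The blob id can be compared with what `git hash-object` reports for a file with the same content. Because each commit's id covers its parent ids, editing an old commit forces new ids for everything built on top of it.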


Things you didn't mention, but are probably good to know:

Everything you do is local to your repository (either created by git init or git clone git://url.com/another/repo.git). There are only a few commands in git that interact with other repositories (a.k.a. teh interwebz), including clone, fetch, pull, push.

Push & pull are used to synchronize repositories. Pull fetches objects from another repository and merges them into your current branch. Push takes your changes and sends them to another repository. You cannot push single commits or changes; you can only push a commit together with its complete history.
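
The "complete history" point can be sketched with a toy model in Python. This is not how git's transfer protocol is implemented; the commit names, the `parents` map and both helper functions are hypothetical and only illustrate that pushing a commit implies sending every ancestor the other side lacks.

```python
def ancestors(commit_id, parents):
    """Return commit_id plus every commit reachable through parent pointers."""
    seen = set()
    stack = [commit_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


def commits_to_push(tip, parents, remote_has):
    # Everything in the tip's history that the remote does not have yet.
    return ancestors(tip, parents) - set(remote_has)


# Hypothetical history: a <- b <- c, a <- x, and a merge commit d with parents c and x.
parents = {"a": [], "b": ["a"], "c": ["b"], "x": ["a"], "d": ["c", "x"]}

missing = commits_to_push("d", parents, remote_has={"a", "b"})
print(sorted(missing))
```

Here pushing `d` also drags along `c` and `x`, because `d` cannot exist on the other side without its parents.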

A single repository can contain multiple branches but does not need to. The default branch in git has traditionally been called master (newer versions let you configure the name, and many hosting sites now default to main). You can create as many branches as you want; merging is a piece of cake with git. Branches are local until you run git push origin <branch>.

A commit describes a complete state of the project. Those states can be compared to one another, which produces a "diff" (git diff origin/master master = see differences between origin/master and master)
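
As a rough illustration of "comparing two states", the following Python sketch treats each commit as a plain dictionary of file contents and produces a unified diff with the standard library. The snapshots and the `diff_snapshots` helper are invented for this example; git's own diff machinery is far more capable (renames, binary files and so on).

```python
import difflib


def diff_snapshots(old, new):
    """Return a unified diff between two {path: text} snapshots."""
    out = []
    for path in sorted(set(old) | set(new)):
        before = old.get(path, "").splitlines(keepends=True)
        after = new.get(path, "").splitlines(keepends=True)
        if before != after:
            out.extend(
                difflib.unified_diff(
                    before, after, fromfile=f"a/{path}", tofile=f"b/{path}"
                )
            )
    return "".join(out)


# Two hypothetical project states, standing in for origin/master and master.
origin_master = {"README": "hello\n"}
master = {"README": "hello\nworld\n", "main.py": "print('hi')\n"}

patch_text = diff_snapshots(origin_master, master)
print(patch_text)
```

Lines starting with `+` were added in the second state and lines starting with `-` were removed, which is the same convention `git diff` output uses.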

Git is pretty powerful when it comes to preparing your commits. The key ingredient here is the "index" (or "staging area"). You can add single changes to the index (using git add) until you think the index looks good. git commit fires up your text editor and you need to provide a commit message (why and how did you make that change); after entering your commit message git will create a new commit – containing the contents of the index – on top of the previous commit (the parent pointer is the SHA1 of the previous commit).
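
The add/commit cycle described above can be mimicked with a small toy model in Python. `ToyRepo` is a made-up class, not git's data model: commit ids are simple counters instead of SHA1 hashes, and the index is just a dictionary. It only shows that a commit snapshots the index (not the worktree) and that the branch label moves forward while the new commit points back at its parent.

```python
class ToyRepo:
    """A toy model of worktree, index, commits and a branch label (not real git)."""

    def __init__(self):
        self.worktree = {}                 # path -> text: the files you edit
        self.index = {}                    # path -> text: what the next commit will contain
        self.commits = {}                  # id -> (parent id, snapshot, message)
        self.branches = {"master": None}   # branch name -> id of its top commit
        self.head = "master"               # the currently checked out branch

    def add(self, path):
        # Stage the file's current content; later edits are not included until re-added.
        self.index[path] = self.worktree[path]

    def commit(self, message):
        parent = self.branches[self.head]
        commit_id = len(self.commits) + 1  # stand-in for the SHA1 that git would compute
        self.commits[commit_id] = (parent, dict(self.index), message)
        self.branches[self.head] = commit_id  # the branch label moves to the new commit
        return commit_id

    def log(self):
        # Walk the parent pointers from the branch tip back to the first commit.
        messages, current = [], self.branches[self.head]
        while current is not None:
            parent, _snapshot, message = self.commits[current]
            messages.append(message)
            current = parent
        return messages


repo = ToyRepo()
repo.worktree["README"] = "hello\n"
repo.worktree["notes.txt"] = "scratch\n"
repo.add("README")                         # only README is staged
c1 = repo.commit("add README")

repo.worktree["README"] = "hello world\n"
repo.add("README")
c2 = repo.commit("expand README")

print(repo.log())
print(sorted(repo.commits[c1][1]))
```

Note that `notes.txt` never ends up in a commit because it was never added to the index, which is exactly the kind of control the staging area is meant to give you.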

nulltoken
knittl
16

Git comes with documentation for exactly what you are looking for.

$ git help glossary
Benjamin Bannier
  • it is on the intertub too! http://www.kernel.org/pub/software/scm/git/docs/gitglossary.html – J-16 SDiZ Aug 16 '11 at 10:34
  • I wouldn't say this is *exactly* what he's looking for - I was going to suggest the same glossary since I've used it myself many a time, but the poster is asking for simple explanations in easy-to-understand terms for a git/vcs newbie, and when I looked up a few of the terms before suggesting the glossary, I found some of them convoluted even though I'm quite familiar with git... – johnny Aug 16 '11 at 14:52
  • I'd disagree, personally - when I was first learning to use git, I found that the glossary was one of the clearest and easiest to understand references. Thanks in part to the dense hyperlinking, it's easy to start with any definition and understand it in detail. – Mark Longair Aug 16 '11 at 15:48
9

I found this (free) book very useful when learning how to use git: http://progit.org/. The book exists in printed form as well.

I think the quickest way to learn git is probably to pick up a book or tutorial which teaches you the basic concepts and terms.

Jonatan
  • Progit is a fantastic resource and many of the other excellent git resources reference it directly as well... I was familiar with cvs/svn prior to using git, but I think progit does a great job of explaining those topics as well, and if it doesn't, I'm sure the creators/maintainers would like to hear suggestions... – johnny Aug 16 '11 at 14:56
3

Another good resource for learning Git is Edgecase's Git Immersion. Trying to learn Git through the man pages is probably very difficult; there is a short, steep learning curve that has to be overcome first. You need to be introduced to the concept of a DVCS (Distributed Version Control System) first.

Progit as recommended by @fulhack is also very good.

I can also strongly recommend Think Like A Git. The explanation of rebase here is worth its weight in gold.

Daniel Lee
  • Git also produces some very hard to understand problems, especially if you do something wrong. I think learning the usual cases is the hardest part of Git because at first you are completely stuck. – Makis Aug 16 '11 at 09:59
  • But once you're past that then Git is brilliant. It can do pretty much anything especially when compared to SVN and TFS. And combined with GitHub it is programmer nirvana. – Daniel Lee Aug 16 '11 at 10:14
  • Of course. I love Git, but it took me a while to get going. Thankfully I sat next to a Git expert because a few of the things I stumbled upon were very cryptic. – Makis Aug 16 '11 at 10:28
  • I liked the git immersion story! That was helpful indeed. – The Apa Aug 17 '11 at 10:06
  • I found ProGit first but it would have been easier if I'd done Git Immersion first and then filled in the blanks with Progit. Happy to hear it helped you out. – Daniel Lee Aug 17 '11 at 11:26
2

The best I have found for understanding git is The Git Parable

Imagine that you have a computer that has nothing on it but a text editor and a few file system commands. Now imagine that you have decided to write a large software program on this system. Because you’re a responsible software developer, you decide that you need to invent some sort of method for keeping track of versions of your software so that you can retrieve code that you previously changed or deleted. What follows is a story about how you might design one such version control system (VCS) and the reasoning behind those design choices...

Benjol
1

I think you might like this article: Git for Computer Scientists

And another important aspect to understand when using git is the workflow. Read this wonderful blog post: Git branching model

yasouser