863

I keep hearing people say they're forking code in Git. Git "fork" sounds suspiciously like Git "clone" plus some (meaningless) psychological willingness to forgo future merges. There is no fork command in Git, right?

GitHub makes forks a little more real by stapling correspondence onto them. That is, you press the fork button and later, when you press the pull request button, the system is smart enough to email the owner. Hence, it's a little bit of a dance around repository ownership and permissions.

Yes/No? Any angst over GitHub extending Git in this direction? Or any rumors of Git absorbing the functionality?

Peter Mortensen
Brian
  • 12
    Yeah, it is just a type of clone which is tracked by the github database. – Paŭlo Ebermann Jun 09 '11 at 01:21
  • 16
    Doesn't GitHub do something special to avoid doubling the storage requirements (on GitHub's own servers)? – Keith Thompson May 10 '12 at 03:44
  • 20
    Not mentioned yet: Deleting a private repo deletes all its forks. Deleting a public repo keeps the forks but promotes one fork to be the new parent repo. If your boss makes your public repo private, it breaks all the existing forks and you won't be able to make pull requests from them to the private repo. https://help.github.com/articles/what-happens-to-forks-when-a-repository-is-deleted-or-changes-visibility/ – Plato Feb 25 '16 at 00:21
  • I believe (without proof since GitHub do not show this to us) that the actual mechanism here is Git's "alternates". In other words, the fork is a mirror clone with `--reference` used. Exactly how public repos and deletions are handled is not at all clear (move alternates to randomly chosen promoted repo? point all forks to some common alternate that's not part of the original fork?) but the use of alternates explains various observable behaviors. – torek Nov 01 '19 at 07:41
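For readers who have not met Git's "alternates", here is a minimal local sketch of what `--reference` does. The repository names (`seed`, `original.git`, `fork.git`) are made up, and nothing here shows how GitHub actually stores forks; it only shows the stock Git mechanism the comment above refers to.

```shell
#!/bin/sh
set -eu
# Hypothetical repository names; this is NOT GitHub's real storage layout.
work=$(mktemp -d)
cd "$work"

# Seed a stand-in "original" repository with one commit.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "hello" > seed/README
git -C seed add README
git -C seed -c user.name=Owner -c user.email=owner@example.com commit -q -m "Initial commit"
git clone -q --bare seed original.git

# Clone again, naming the original as a reference repository.
git clone -q --bare --reference "$work/original.git" original.git fork.git

# The second clone records the original's object directory as an alternate.
cat fork.git/objects/info/alternates
```

The `alternates` file lets `fork.git` look up objects in `original.git`. That would be one way to avoid storing the same history twice.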

10 Answers

971

Fork, in the GitHub context, doesn't extend Git.
It only allows cloning on the server side.

When you clone a GitHub repository on your local workstation, you cannot contribute back to the upstream repository unless you are explicitly declared as "contributor". That's because your clone is a separate instance of that project. If you want to contribute to the project, you can use forking to do it, in the following way:

  • clone that GitHub repository on your GitHub account (that is the "fork" part, a clone on the server side)
  • contribute commits to that GitHub repository (it is in your own GitHub account, so you have every right to push to it)
  • signal any interesting contribution back to the original GitHub repository (that is the "pull request" part by way of the changes you made on your own GitHub repository)

Check also "Collaborative GitHub Workflow".

If you want to keep a link with the original repository (also called upstream), you need to add a remote referring to that original repository.
See "What is the difference between origin and upstream on GitHub?"
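The fork-then-upstream workflow described above can be tried entirely on one machine. This is only a sketch under stated assumptions: two local bare repositories stand in for the GitHub-hosted original and fork, and every name (`original.git`, `fork.git`, `local`, the user "Alice") is hypothetical. On GitHub the "fork" step is the Fork button rather than `git clone --bare`.

```shell
#!/bin/sh
set -eu
# All paths and names below are hypothetical stand-ins for GitHub-hosted repos.
work=$(mktemp -d)
cd "$work"

# Stand-in for the original GitHub repository, seeded with one commit.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "hello" > seed/README
git -C seed add README
git -C seed -c user.name=Owner -c user.email=owner@example.com commit -q -m "Initial commit"
git clone -q --bare seed original.git

# "Fork": a server-side clone that the contributor owns.
git clone -q --bare original.git fork.git

# Local clone of the fork; "origin" now points at the fork.
git clone -q fork.git local
cd local

# Keep a link to the original repository under the conventional name "upstream".
git remote add upstream "$work/original.git"
git fetch -q upstream

# Commit locally and push to the fork (allowed, since the contributor owns it).
echo "fix" >> README
git -c user.name=Alice -c user.email=alice@example.com commit -q -am "Improve README"
git push -q origin main

# Show both remotes: origin (the fork) and upstream (the original).
git remote -v
```

After the push, the new commit exists in the fork but not in the original. Getting it there is what the pull request is for.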

[Diagram: fork and upstream]

With Git 2.20 (Q4 2018) and later, fetching from a fork is more efficient, thanks to delta islands.

VonC
  • 7
    "When you are cloning a GitHub repo on your local workstation, you cannot contribute back to the upstream repo unless you are explicitly declared as "contributor"." --- Is this not true with "forking"? Please explain. – chharvey Aug 03 '13 at 18:16
  • 65
    @TestSubject528491 no, with a fork, that means you are cloning the upstream repo as *your own* repo on the GitHub server side. Then you can locally clone that new "fork" repo on your computer and freely push back on it, since you are the creator and owner of that fork. – VonC Aug 03 '13 at 19:00
  • 14
    To me, the key point is that you can't submit a PR from your local copy _unless you're declared to be a contributor_. I'm so used to submitting PRs from my local repo, but that's because I'm always marked as a contributor. If you think about it, to submit a PR you have to push a branch to the remote repo and then create the PR. I guess it makes sense if you don't want random people creating branches on your repo. And that you'd prefer them to fork it and submit PRs that way instead. – Adam Zerner Feb 22 '16 at 01:18
  • I've seen the second "upstream" remote approach elsewhere, but isn't it more straightforward to pull directly from "GitHub - Original" to "GitHub - Fork"? The second remote approach didn't seem to work in my Eclipse and eGit setup, failing to push to my "GitHub - Fork" repo (nothing to push). – William T. Mallard Sep 28 '16 at 22:05
  • Never mind, the info here [link](http://www.eqqon.com/index.php/Collaborative_Github_Workflow) gave me some insight. As a contributor it makes sense to pull from "GitHub - Original" to "GitHub - Fork", then from " - Fork" to local machine, but if you're the owner you probably want to pull directly from my " - Fork" first to review, run tests, etc. before pushing to " - Original". – William T. Mallard Sep 28 '16 at 22:29
  • "When you are cloning a GitHub repo on your local workstation, you cannot contribute back to the upstream repo unless you are explicitly declared as "contributor" ---- this sentence makes it sound like changes to a fork can freely be pushed back to the original without being a contributor. – Darko Maksimovic Aug 16 '17 at 08:37
  • @DarkoMaksimovic That sentence is followed by "So that clone (to your local workstation) isn't a "fork". It is just a clone." That should be clear enough. Considering you own a fork (since it is your repo), you can push back to it. – VonC Aug 16 '17 at 08:41
  • @VonC That sentence also makes it sound like I stated. Let me be more clear: I'm not saying that it sounds like you can freely push changes to your fork, because you can, and that would be ok. Instead, it sounds like - if you fork a repo and add changes to that fork, then those changes can freely be pushed back to the original repo. Just read it with this in mind and you'll see it sounds like it. I would suggest that, instead of the second sentence you quoted, you add something like "... You can only push changes to your own fork, not to the original repo unless you're contributor" – Darko Maksimovic Aug 17 '17 at 09:14
  • @DarkoMaksimovic OK, I see: can you edit directly the answer: I will review the edit and approve it :) – VonC Aug 17 '17 at 09:16
  • @DarkoMaksimovic Yes, thank you. I approved your edit immediately. – VonC Aug 20 '17 at 06:22
  • 1
    Is the fork a `clone --bare` or `clone --mirror` ? – theonlygusti Sep 25 '19 at 08:22
  • 1
    @theonlygusti mirror on the server side (GitHub). – VonC Sep 25 '19 at 10:42
  • The original diagram in "[What is the difference between origin and upstream on GitHub?](https://stackoverflow.com/questions/9257533/what-is-the-difference-between-origin-and-upstream-in-github/9257901#9257901)" correctly displays a unidirectional arrow from "GitHub - Original" to "Local Machine" :) – Ricardo Oct 06 '19 at 01:23
  • @Ricardo Thank you. I have fixed the image in this answer. – VonC Oct 06 '19 at 04:02
144

I keep hearing people say they're forking code in git. Git "fork" sounds suspiciously like git "clone" plus some (meaningless) psychological willingness to forgo future merges. There is no fork command in git, right?

"Forking" is a concept, not a command specifically supported by any version control system.

The simplest kind of forking is synonymous with branching. Every time you create a branch, regardless of your VCS, you've "forked". These forks are usually pretty easy to merge back together.

The kind of fork you're talking about, where a separate party takes a complete copy of the code and walks away, necessarily happens outside the VCS in a centralized system like Subversion. A distributed VCS like Git has much better support for forking the entire codebase and effectively starting a new project.

Git (not GitHub) natively supports "forking" an entire repo (i.e., cloning it) in a few ways:

  • when you clone, a remote called origin is created for you
  • by default the branch checked out by the clone tracks its origin equivalent, and every other branch is available as a remote-tracking branch (origin/<name>)
  • fetching and merging changes from the original project you forked from is trivially easy
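As a rough illustration of these points, the following sketch clones a throwaway local repository and inspects what Git set up. The directory names (`project`, `myfork`) and the file contents are invented for the example.

```shell
#!/bin/sh
set -eu
# Hypothetical directory names; "project" plays the role of the original.
work=$(mktemp -d)
cd "$work"

git init -q project
git -C project symbolic-ref HEAD refs/heads/main
echo "v1" > project/file.txt
git -C project add file.txt
git -C project -c user.name=Owner -c user.email=owner@example.com commit -q -m "v1"

# Cloning creates the "origin" remote automatically.
git clone -q project myfork
cd myfork
git remote                                      # prints: origin

# The checked-out branch is configured to track its origin counterpart.
git rev-parse --abbrev-ref 'main@{upstream}'    # prints: origin/main

# The original project moves on...
echo "v2" > "$work/project/file.txt"
git -C "$work/project" -c user.name=Owner -c user.email=owner@example.com commit -q -am "v2"

# ...and picking up its changes is a single command.
git pull -q --ff-only
cat file.txt
```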

Git makes contributing changes back to the source of the fork as simple as asking someone from the original project to pull from you, or requesting write access to push changes back yourself. This is the part that GitHub makes easier, and standardizes.
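Here is a small sketch of the "ask someone to pull from you" route using plain Git and no hosting service. It assumes the maintainer can reach the contributor's repository, which here is simply another local directory. The names `original`, `contributor` and the remote name `gideon` are hypothetical.

```shell
#!/bin/sh
set -eu
# Hypothetical names; both repositories live in one temp directory for the demo.
work=$(mktemp -d)
cd "$work"

git init -q original
git -C original symbolic-ref HEAD refs/heads/main
echo "v1" > original/file.txt
git -C original add file.txt
git -C original -c user.name=Owner -c user.email=owner@example.com commit -q -m "v1"

# The contributor clones the project and commits a change.
git clone -q original contributor
echo "feature" > contributor/feature.txt
git -C contributor add feature.txt
git -C contributor -c user.name=Gideon -c user.email=gideon@example.com commit -q -m "Add feature"

# The maintainer adds the contributor's repository as a remote and pulls from it.
cd original
git remote add gideon "$work/contributor"
git fetch -q gideon
git merge -q --ff-only gideon/main

git log --oneline
```

In practice the remote URL would be whatever address the contributor's repository is reachable at. Providing that address is what a hosted fork gives you.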

Any angst over Github extending git in this direction? Or any rumors of git absorbing the functionality?

There is no angst because your assumption is wrong. GitHub "extends" the forking functionality of Git with a nice GUI and a standardized way of issuing pull requests, but it doesn't add the functionality to Git. The concept of full-repo-forking is baked right into distributed version control at a fundamental level. You could abandon GitHub at any point and still continue to push/pull projects you've "forked".

Al Sweigart
user229044
  • 6
    Thanks for your excellent answer. I just want to clarify, this means, **outside the context of github** I could clone some `X project` on my machine. If I make changes in my local and **don't have** write access to origin, I will email the author of the project to request a pull. He will make a remote called _gideon_ which will be a url to my local clone, and he can pull, right? – gideon Sep 16 '13 at 07:51
  • 1
    If you want to contribute your changes to a project, you can either save them into files, e.g. using git format-patch, and attach them to an email to someone who has that write access, or you can obtain your own hosting, push your work to that and send the URL in an email, e.g. using the git request-pull command. Repos on workstations are not usually directly accessible online. – bdsl Jan 25 '16 at 11:11
  • But yes, if your workstation happens to be accessible over the internet to the author of the project then you can simply send the URL to them and they can add it as a remote and pull from it. – bdsl Jan 25 '16 at 11:12
  • 1
    Re: angst, the only such for me is that there's no link or button to click to create a pull-from-my-repo's-perspective button where GitHub tells you you're 50 commits behind. No biggie now that I know they're using the term "Pull Request" to also include requests for pulling from the upstream to your GitHub fork. Git is hard. – William T. Mallard Sep 28 '16 at 22:10
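The patch-by-email route mentioned in the comments above can be sketched locally as well. The "email" step is replaced here by a shared directory, and all names (`original`, `contributor`, `patches`, the user "Gideon") are hypothetical. `git format-patch` and `git am` are standard Git commands.

```shell
#!/bin/sh
set -eu
# Hypothetical names; the patches directory stands in for an email attachment.
work=$(mktemp -d)
cd "$work"

git init -q original
git -C original symbolic-ref HEAD refs/heads/main
echo "v1" > original/file.txt
git -C original add file.txt
git -C original -c user.name=Owner -c user.email=owner@example.com commit -q -m "v1"

# Contributor: clone, commit, and export the last commit as a patch file.
git clone -q original contributor
echo "feature" > contributor/feature.txt
git -C contributor add feature.txt
git -C contributor -c user.name=Gideon -c user.email=gideon@example.com commit -q -m "Add feature"
git -C contributor format-patch -q -1 -o "$work/patches"

# Maintainer: apply the received patch, keeping the contributor as author.
git -C original -c user.name=Owner -c user.email=owner@example.com am -q "$work"/patches/*.patch

git -C original log -1 --format='%an: %s'
```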
86

Yes, a fork is a clone. It emerged because you cannot push to others' copies without their permission. They make a copy of it for you (the fork), where you will have write permission as well.

In the future, if the actual owner or other users with a fork like your changes, they can pull them back into their own repository. Alternatively, you can send them a "pull request".

Peter Mortensen
ssapkota
  • 1
    Can I simply clone the repository to my local machine, create a branch, then submit a pull request to the original owner? It seems redundant to have multiple copies of repos hosted all over GitHub, just to facilitate code updates. – Casey Sep 01 '15 at 12:57
  • 5
    @Casey You can only send a pull request through GitHub from GitHub itself and you can only send a GitHub pull request from a branch that exists on GitHub. If you are not a collaborator on the Repository in question, there is no way for you to create a branch from which you can initiate a GitHub pull request. Nothing stops you from doing it via email the old fashioned way, but GitHub plays no part in that. – Beau Simensen Sep 06 '15 at 20:24
  • 3
    @Casey, a _reason_ is that normally others do not have URL access to your workstation. The GitHub `fork` means there is a copy of your work on the GitHub server, that you can `push` to and which others do have URL access to so they can `pull`. The `pull request` is just a standard way of getting the URL for your copy (up on GitHub) to them so they can easily pull it into their repository. – Jesse Chisholm Jun 13 '16 at 23:19
  • This should be the correct/accepted answer I believe. Imagine a mess in a scene where a team of 15-20 developers creating branches and pushing to origin versus 15-20 developers having their own copy of same repository and making as many branches and doing changes and pushing it back. Then Author of original repository can pull only changes he/she wants. – Kishor Pawar Jul 19 '16 at 07:31
39

"Fork" in this context means "Make a copy of their code so that I can add my own modifications". There's not much else to say. Every clone is essentially a fork, and it's up to the original to decide whether to pull the changes from the fork.

Daenyth
  • 5
    In specific: "Make a copy of their code `on the GitHub server` so that I can add my own modifications `and others can have URL access to my version`". Most local workstations do not offer URL access for anyone to be able to pull. But if you push to your fork on the server, then they can have the URL for the pull. – Jesse Chisholm Jun 13 '16 at 23:21
  • The question is not about forking in general, but about GitHub forking specifically. – reinierpost Jun 08 '17 at 08:46
27

Cloning involves making a copy of the Git repository on a local machine, while forking is cloning the repository into another repository. Cloning is for personal use only (although future merges may occur), but with forking you are copying and opening a new possible project path.

Sam Johnson
12

There is a misunderstanding here with respect to what a "fork" is. A fork is in fact nothing more than a set of per-user branches. When you push to a fork you actually do push to the original repository, because that is the ONLY repository.

You can try this out by pushing to a fork, noting the commit and then going to the original repository and using the commit ID, you'll see that the commit is "in" the original repository.

This makes a lot of sense, but it is far from obvious (I only discovered this accidentally recently).

When John forks repository SuperProject what seems to actually happen is that all branches in the source repository are replicated with a name like "John.master", "John.new_gui_project", etc.

GitHub "hides" the "John." from us and gives us the illusion we have our own "copy" of the repository on GitHub, but we don't and nor is one even needed.

So my fork's branch "master" is actually named "Korporal.master", but the GitHub UI never reveals this, showing me only "master".

This is pretty much what I think goes on under the hood anyway based on stuff I've been doing recently and when you ponder it, is very good design.

For this reason I think it would be very easy for Microsoft to implement Git forks in their Visual Studio Team Services offering.

Peter Mortensen
Hugh
  • 1
    Dear Hugh, half of your response is actually incorrect -- a fork is a clone of a whole repository from one user account to another user account, together with all the branches and history. When you commit to the fork, nothing changes in the original repository from which you forked. But besides these few misunderstandings on your part regarding what a "fork" is, there is now some good news: Visual Studio Team services include now a "Fork" functionality. ;) – Sorin Postelnicu Jan 11 '18 at 23:01
  • 1
    @SorinPostelnicu source? I'm inclined to believe Hugh here due to personal experience of forks behaving in ways that are inconsistent with them being a simple clone of the repository. For example, when upstream is deleted, forks are deleted (as was mentioned in a comment on OP's question) and sometimes upstream has wound up merging things into branches of my forks when accepting a pull request, without me doing anything. – potato Feb 15 '18 at 03:12
  • 1
    Indeed this appears to be the case. After all it would be incredibly stupid for GitHub to literally `git clone` a whole new repository (even a "bare" one) every time someone pushes the "fork" button -- that would be an incredible waste of storage, and likely an attack vector as well. – Greg A. Woods Mar 23 '18 at 23:37
11

I think a fork is a copy of another repository, but under your own account. For example, if you directly clone another repository locally, the remote origin still points at the account you cloned from. You can't push commits to it and contribute your code; it is just a plain copy of the code. If you fork a repository instead, GitHub clones the repo into your own GitHub account. Then, when you clone that repo in the context of your account, you can push your commits.

Daniel Shen
11

Forking is done when you decide to contribute to some project. You make a copy of the entire project along with its history. This copy lives entirely in your own repository, and once you have made your changes, you issue a pull request. It's then up to the owner of the source to accept your pull request and incorporate the changes into the original code.

Git clone is an actual command that allows users to get a copy of the source: `git clone [URL]` creates a copy of [URL] in your own local repository.

aliasav
5

Apart from the fact that cloning is from server to your machine and forking is making a copy on the server itself, an important difference is that when we clone, we actually get all the branches, labels, etc.

But when we fork, we actually only get the current files in the master branch, nothing other than that. This means we don't get the other branches, etc.

Hence, if you have to merge something back to the original repository, it is an inter-repository merge and will definitely need higher privileges.

Fork is not a command in Git; it is just a concept which GitHub implements. Remember Git was designed to work in peer-to-peer environment without the need to synchronize stuff with any master copy. The server is just another peer, but we look at it as a master copy.

Peter Mortensen
Deepak G M
  • 10
    Huh? A fork gets all the branches, though you have to know where to look (hint: `git branch -a`). – tripleee Jan 12 '16 at 05:52
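The `git branch -a` hint can be checked with a local stand-in for a fork. In this sketch the server-side copy is approximated with `git clone --bare`, which is an assumption about how to simulate a fork, not a description of GitHub internals. The names `original`, `fork.git`, `local` and the branch `new_gui_project` are hypothetical.

```shell
#!/bin/sh
set -eu
# Hypothetical names; "fork.git" approximates a server-side fork.
work=$(mktemp -d)
cd "$work"

# Original repository with a second branch besides main.
git init -q original
git -C original symbolic-ref HEAD refs/heads/main
echo "v1" > original/file.txt
git -C original add file.txt
git -C original -c user.name=Owner -c user.email=owner@example.com commit -q -m "v1"
git -C original branch new_gui_project

# Server-side copy, then a local clone of that copy.
git clone -q --bare original fork.git
git clone -q fork.git local

# Only main is checked out locally, but every branch is listed under remotes/origin/.
git -C local branch -a
```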
4

In simplest terms,

When you say you are forking a repository, you are basically creating a copy of the original repository under your GitHub ID in your GitHub account.

and

When you say you are cloning a repository, you are creating a local copy of the original repository in your system (PC/laptop) directly without having a copy in your GitHub account.

Peter Mortensen
Harshal