59

Google stores its entire codebase in a single repository called Piper [1] [2] [3].

Its approach is very different from that of open source alternatives (it is a centralized 'cloud' service), and it aims to scale to a repository with billions of files, thousands of developers and millions of commits [1].

Google doesn't seem to have open-sourced it, nor does it seem to plan to (unlike its build system Blaze, released as Bazel, and some other tools [4]).

Are you aware of any open source version control system with an approach similar to Piper's?

Aykhan Hagverdili
Colin Pitrat
  • 2
    Do you need to store 2 billion lines of code? In terms of free cloud-based VCS, Bitbucket is very generous. – Tomos Williams Oct 03 '17 at 15:41
  • 2
    Not myself, but I've met companies that are not far from that and currently use hundreds of git/hg/cvs repositories, with dependencies between them. Updating the version of the "common" components shared by nearly all repositories is a nightmare. So those releases are rare, and other repositories contain code that should be common but is duplicated because that's easier than integrating it into the existing common repo. There are also other problems of findability, integration testing, etc. Basically, all the reasons that Google gives for using a single repository. – Colin Pitrat Oct 03 '17 at 19:42
  • In that case I'd probably be looking at something like SVN – Tomos Williams Oct 04 '17 at 10:05
  • The claim is that “lots of repos with interdependencies” is complicated, and “they keep copies of code...” as a solution, which is not a solution & contributes to the problem. I’m not sure which languages are in play in this scenario (it doesn’t matter too much except for illustrating w/ concrete examples), but the solution is to continue to use separate repos, and publish released, versioned artifacts to binary repositories (eg artifactory, nexus, etc), and in other repos declare dependencies using only versions. Eg see all java-based dev (incl Scala & other jvm langs) and C/C++ (Linux dev) – michael Apr 13 '18 at 19:42

5 Answers

39

The short answer is no, it doesn't seem to exist.

As you can read in a Quora article, "it’s hard to tell where the version control system ends, and where some of the other parts of the development toolchain begin".

So, first, you need to be clear about which "features" you are interested in, since you might be interested in a feature that is not Piper's responsibility.

Also, keep in mind that your server disk space and OS would limit the file count/size before the chosen VCS.

If you need a centralized VCS and billions of files, you could go with SVN or OpenCVS.

If you need a distributed one with thousands of developers and millions of commits, take a look at Git, Bazaar or Mercurial (Bitbucket is a hosting service rather than a VCS in its own right).

But do you really have all those requirements?

AFAIK there's no open source equivalent of Piper on the market.

In order to better understand Centralized and Distributed VCS, take a look at this Comparison between Centralized and Distributed Version Control Systems

Also, take a look at what is Google's repository like?

MiguelKVidal
  • 1
    "keep in mind that your server disk space and OS would limit the file count/size before the chosen VCS" -> That's the point: it's not the case with Piper. It's a centralized VCS but a distributed service. The 'cloud service' approach allows 'infinite scalability'. The repository content is sharded. A 'checkout' is just metadata on the server side + mounting a 'network FS'. File operations translate to RPCs to the service that maintains the status of your 'local copy'. Benefit: your 'local copy' is in fact potentially accessible to anybody. – Colin Pitrat Oct 03 '17 at 19:54
  • So for the features, I would say 'distributed service' with 'FUSE interface', although that's more of an implementation detail. There may be other ways to achieve the real requirement, which is 'infinite scalability'. Plus the basic features you expect from a VCS, of course: history, working copy, update and merge ... I also have the impression that it doesn't exist in open source. A few proprietary solutions claim this (Perforce, Plastic) but I'm not sure how true this is. – Colin Pitrat Oct 03 '17 at 20:00
  • @ColinPitrat Perforce "claims" this, but I have never used it to verify. I don't know of any open source project with resources like what Google Piper offers. Right now, the best simple answer to your question would be "no, there's no option similar to Piper". The vast majority of companies are very far from needing "infinite scalability" anyway. Keep in mind that Google Piper is only one part of the solution Google uses, which involves many other tools. – MiguelKVidal Oct 03 '17 at 20:22
  • 1
    Yes, and some of the tools are open-sourced (e.g. Bazel), so it would seem logical to also open-source Piper. Not sure why it's not the case ... – Colin Pitrat Oct 03 '17 at 20:59
  • Google used Perforce before switching to Piper – Hendy Irawan May 14 '18 at 18:09
21

Two recent developments bring Piper-like features to Git: VFS for Git and sparse-checkout.

The first: Microsoft recently open-sourced VFS for Git which feels like it brings some of Piper's monorepo features to Git.

VFS for Git virtualizes the filesystem beneath your Git repository so that Git tools see what appears to be a normal repository when, in fact, the files are not actually present on disk. VFS for Git only downloads files as they are needed.

VFS for Git also manages Git's internal state so that it only considers the files you have accessed, instead of having to examine every file in the repository. This ensures that operations like status and checkout are as fast as possible.

This is used by Microsoft for >4000 developers in a >300GB repo with >2 million commits in their Windows Git repository.
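
To make the on-demand idea above concrete, here is a minimal toy sketch in Python. It is not how VFS for Git is implemented (that works at the filesystem level); the LazyWorkingTree class and the fetch callback are hypothetical names I made up for illustration. It only shows the two behaviours described: every path is visible up front, content is downloaded on first read, and only files that were actually accessed are candidates for operations like status.

```python
from typing import Callable, Dict, List


class LazyWorkingTree:
    """Toy model of on-demand file hydration (not VFS for Git's real design)."""

    def __init__(self, paths: List[str], fetch: Callable[[str], bytes]) -> None:
        self._paths = set(paths)                # every path is known up front
        self._fetch = fetch                     # hypothetical download callback
        self._hydrated: Dict[str, bytes] = {}   # contents fetched so far

    def listdir(self) -> List[str]:
        # Listing needs no download: the tree looks complete to tools.
        return sorted(self._paths)

    def read(self, path: str) -> bytes:
        if path not in self._paths:
            raise FileNotFoundError(path)
        if path not in self._hydrated:
            # First access: download the content once and cache it.
            self._hydrated[path] = self._fetch(path)
        return self._hydrated[path]

    def status_candidates(self) -> List[str]:
        # Only files that were actually accessed need to be examined.
        return sorted(self._hydrated)


if __name__ == "__main__":
    downloads: List[str] = []

    def fake_fetch(path: str) -> bytes:
        # Stand-in for a network request to the server.
        downloads.append(path)
        return f"contents of {path}".encode()

    tree = LazyWorkingTree(["a.c", "b.c", "c.c"], fake_fetch)
    tree.read("b.c")
    tree.read("b.c")  # second read is served from the local cache
    print(tree.listdir(), downloads, tree.status_candidates())
```

In this toy model a repository with millions of paths costs nothing beyond the path listing until a file is read, which is the property that makes status and checkout cheap.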

The second: the sparse-checkout command, added in Git v2.25.0, allows you to check out just a subset of your monorepo. This should speed up commands like git pull and git status. See this blog post for more info. Unfortunately, you have to manually specify which subdirectories you want to check out with Git sparse-checkout, whereas Piper handles this transparently for developers.
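
As a rough sketch of that manual step, the Python helper below builds and runs the two git sparse-checkout subcommands (init --cone and set) inside an existing clone. The function names and the example directories are my own hypothetical choices, and it assumes Git 2.25.0 or newer is on the PATH.

```python
import subprocess
from typing import List, Sequence


def sparse_checkout_commands(directories: Sequence[str]) -> List[List[str]]:
    """Build the git commands that restrict the working tree to `directories`."""
    if not directories:
        raise ValueError("at least one directory is required")
    return [
        # Enable sparse-checkout in cone mode (whole-directory patterns).
        ["git", "sparse-checkout", "init", "--cone"],
        # Keep only the listed directories checked out
        # (cone mode also keeps the files in the repository root).
        ["git", "sparse-checkout", "set", *directories],
    ]


def apply_sparse_checkout(repo_path: str, directories: Sequence[str]) -> None:
    """Run the commands inside an existing clone; requires Git >= 2.25.0."""
    for command in sparse_checkout_commands(directories):
        subprocess.run(command, cwd=repo_path, check=True)


if __name__ == "__main__":
    # Hypothetical directory names, printed rather than executed.
    for cmd in sparse_checkout_commands(["services/search", "libs/common"]):
        print(" ".join(cmd))
```

Calling apply_sparse_checkout("/path/to/monorepo", ["services/search"]) would then leave only that directory (and the root files) in the working tree; the list of directories is exactly the part you have to maintain by hand.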

michaelrbock
6

Google has built more than one version control tool. Piper is specialized for the needs of the Google monorepo.

When Google built Android, it built Gerrit and Repo to handle version control. Repo is used to work with many Git repositories at once, each of which may have its own maintainers and release cycles. Open source dependencies don't lend themselves to a monorepo, because there is no single organization in control to enforce things such as a global build status or global refactoring. Also, the requirements of Piper, such as commit performance keeping up with the rate of requests, simply don't apply in most places.

Mark
3

There is no open-source equivalent to Piper.

Note that Piper is old and has an old-fashioned API dating from the Perforce era. I guess you would want a more modern workflow, similar to what modern DVCSs offer.

I'm pretty sure your codebase isn't as large as Google's 86TB repository. Do you really need the same thing?

I'm pretty sure you could use a monorepo based on Git or Mercurial, and maybe move to a virtual file system such as VFS for Git if you ever need it.

Jake Sylvestre
rds
1

Meta is open-sourcing Sapling, which is based on our internal source control system but also has an added layer that allows it to be backed by a regular GitHub repository if you want the semantics without the scalability.

I've never worked at Google and don't know how different it is from their monorepo setup, but I've been at Meta for six years, and now that Sapling is open source I have transitioned to it in all of my personal projects.

Chris Hawkins