Preventing a user from seeing other branches in a repository

Question

We are using Bitbucket and Git. Is it possible to set up a user who can SEE JUST ONE BRANCH in the repository, which he can then work on?

Even if you could, and even if you could limit the user to cloning only that branch, he could still create new ones locally. Which problem are you trying to solve here? — Lasse V. Karlsen, Jan 05 '18 at 11:34
Creating new ones locally is OK. We have an outsourced worker who should have access to just that branch. We would merge that branch into our develop branch pretty frequently. The thing is that the branch should initially contain all the code from the develop branch except one file, which is protected property. That's why I'm looking for an answer to this question. — SFin, Jan 05 '18 at 11:41
If it's 'protected' as in it contains SSH keys or similar, it probably shouldn't be in git and certainly not in the same repo as the rest of the code. — Holloway, Jan 05 '18 at 11:45
@Holloway It contains calculations, so it is code. And we need that code in Git since it gets updated. Right now we are handling this by using two repos, but we are looking for something simpler. — SFin, Jan 05 '18 at 11:52
Or you can encrypt this file using, e.g., https://github.com/AGWA/git-crypt — Rumid, Jan 05 '18 at 15:29
Possible duplicate of [How to keep/maintain public and private code in the same repository? (at repository hostings)](https://stackoverflow.com/questions/23242342/how-to-keep-maintain-public-and-private-code-in-the-same-repository-at-reposit) — phd, Jan 05 '18 at 16:22

Mark Adelsberger · Answer 1 · 2018-01-05T16:49:32.630

UPDATE

Comments on the question suggest another approach: You could encrypt the secret file in the repo, and not share the decryption key with the contractor. If the encryption is good, this is a viable solution and easier than any of the approaches below. It still makes me a little uneasy, if I'm being truly paranoid, because encryption schemes that people think are "secure enough" get compromised. Still, it's an option.

That also suggests another approach:

Suppose you're not using git lfs with this repo. Then you could use git lfs to track the secret file. (This is not what lfs is really for, but bear with me.) You would then not share your lfs store with the contractor, so the contractor should not configure their clone of the repo to use lfs (or, if they do, they'll just get errors when trying to resolve the secret file, which is still ok).

This idea would work because lfs replaces the file in the repo with a "pointer file" containing an SHA-256 hash of the actual file. Without access to your lfs store, there is no remotely practical way to infer the actual file content from this hash.

The trouble with this is, if you want to use lfs for its intended purpose (managing large binary files), then you'd need to share the lfs store, and so the lfs store can no longer hide secrets. (Well, there's maybe a way around that, but it starts to get weird again.) Even then, there's an option of setting up your own clean/smudge filters that work like lfs (substituting a "meaningless" pointer for the original file, etc.), though that's a bit more of an advanced configuration to sort out.
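To give a feel for that last option, here is a rough sketch of a home-made clean/smudge pair, not a hardened setup. The filter name `secret`, the `secret-pointer` prefix, the two helper scripts and the `private-store` directory are all made up for illustration, and it assumes `sha256sum` is on the PATH. The clean filter stashes the real content in a store you never share and hands git only a hash; the smudge filter swaps the content back in if the store has it.

```shell
set -eu
work=$(mktemp -d)                  # throwaway directory for the demo
store="$work/private-store"        # hypothetical private store, never shared
mkdir -p "$store"

# clean filter: save the real content under its SHA-256, emit only a pointer
cat > "$work/clean.sh" <<EOF
#!/bin/sh
tmp=\$(mktemp)
cat > "\$tmp"
hash=\$(sha256sum "\$tmp" | cut -d' ' -f1)
mv "\$tmp" "$store/\$hash"
echo "secret-pointer \$hash"
EOF

# smudge filter: restore the content if the store has it, else keep the pointer
cat > "$work/smudge.sh" <<EOF
#!/bin/sh
read -r tag hash
if [ -f "$store/\$hash" ]; then cat "$store/\$hash"; else echo "\$tag \$hash"; fi
EOF
chmod +x "$work/clean.sh" "$work/smudge.sh"

git init -q "$work/repo"
cd "$work/repo"
git config filter.secret.clean "$work/clean.sh"
git config filter.secret.smudge "$work/smudge.sh"
echo 'secret-file filter=secret' > .gitattributes
echo 'proprietary calculation' > secret-file
git add .gitattributes secret-file

stored=$(git cat-file -p :secret-file)   # what the repo actually records
echo "stored in repo: $stored"
```

The contractor would simply never configure the `secret` filter, so their checkout contains the pointer line and nothing else. They also must not edit that file, since a commit from their side would record whatever text they put there.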

The down-side to any of these solutions is that the change-tracking functionality of git is hindered. For example, you can't directly diff this version of the secret file against that version of the secret file if the stored versions are encrypted or replaced by pointer files; you'd have to check the two versions out to working trees and diff them on the file system.

Anyway, onward with the original answer...

This is not at all an easy problem to solve.

What MaSiMan says is more or less true, but doesn't address the potential complexity of sharing changes between the fork and the original on an ongoing basis (without leaking the secret file into the fork). And of course you may already have your repo with the secret file and a bunch of history, and no time machine. So in that case... still doable but perhaps more involved.

I'm going to walk through what it would take to make this approach work, mostly in the hopes it will convince you that it isn't practical. So if you want to take my word for it, skip down to the paragraph beginning "Ok, so let's face it", after which I'll suggest an alternative that you probably won't like (but that still beats the heck out of this approach).

So you do need at least two repos. I don't know of any hosting software that allows user-level control over reading specific branches, so you need a repo that the contractor can fully read.

Start by creating a branch you could safely share. It sounds like you had a plan in mind for that, but a word of caution: If the plan were to create the branch, delete the secret file, and then commit to the branch...

x -- x -- x -- A -- x -- x <--(master)
                \
                 \ # rm secret-file
                  \
                   B <--(contractor)

...the contractor branch isn't as "clean" as you might think, because A is just as much "on the contractor branch" as it is "on master".
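If you want to see that for yourself, the following throwaway demo (the repo layout, file names and commit messages are just placeholders) builds exactly the graph above and then pulls the secret back out through the branch's own history:

```shell
set -eu
work=$(mktemp -d)                          # throwaway directory for the demo
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/master    # name the initial branch master
git config user.name "Demo"
git config user.email "demo@example.invalid"

echo 'proprietary calculation' > secret-file
echo 'shared code' > app.txt
git add . && git commit -q -m "A: includes secret-file"

git checkout -q -b contractor
git rm -q secret-file
git commit -q -m "B: rm secret-file"

# The tip of the branch no longer has the file...
tip_files=$(git ls-tree -r --name-only contractor)
# ...but its parent (commit A) is still reachable and still has it.
recovered=$(git show contractor~1:secret-file)

echo "files at tip: $tip_files"
echo "recovered from history: $recovered"
```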

To make this work, your second repo has to be a shallow clone.

git clone --depth=1 -b contractor url/of/origin

Your hosting software may not provide for making a "shallow fork", but as long as it will host a shallow repo you could at least create the clone manually and host it as the contractor's origin repo. Then you just have to coordinate pushes/pulls between the repos. Again this may depend on the hosting software, but worst case you could set up a local clone of the shallow repo with the original repo as a second remote.
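As a quick sanity check, here is a sketch of the shallow clone against a local stand-in for the origin (all paths and names are invented for the demo). Note the `file://` URL: as far as I know, git ignores `--depth` when you clone from a plain local path, so the demo has to go through the normal transport to behave like a real remote.

```shell
set -eu
work=$(mktemp -d)
git init -q "$work/origin"
cd "$work/origin"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.invalid"

echo 'proprietary calculation' > secret-file
echo 'shared code' > app.txt
git add . && git commit -q -m "A: includes secret-file"
secret_blob=$(git rev-parse HEAD:secret-file)   # object ID of the secret

git checkout -q -b contractor
git rm -q secret-file
git commit -q -m "B: rm secret-file"

# Shallow clone of just the contractor branch
git clone -q --depth=1 -b contractor "file://$work/origin" "$work/contractor-origin"
cd "$work/contractor-origin"

commit_count=$(git rev-list --count HEAD)
if git cat-file -e "$secret_blob" 2>/dev/null; then
  secret_present=yes
else
  secret_present=no
fi
echo "commits visible: $commit_count, secret blob present: $secret_present"
```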

If you want the contractor to have a more complete history (of everything except the secret file), you could use git filter-branch to produce a sanitized history.

git checkout master
git checkout -b clean-master
git filter-branch --index-filter 'git rm --cached --ignore-unmatch -- path/to/secret-file' -- clean-master

Now you have (in a single repo)

A -- B -- C <--(master)

A' -- B' -- C' <--(clean-master)

where secret-file is nowhere in the history of clean-master. So you create a clone containing only the clean-master branch

git clone --single-branch -b clean-master url/of/origin

Now to be ultra-paranoid, I should point out that this should create a "clean" clone, but conceivably a git implementation could take shortcuts in handling of pack files so that information would "leak" from the original branches into the new repo. If this worries you, you could check for the presence of the secret file in the new repo (i.e. locate its BLOB ID(s) in the original and see if the clone knows any object by those ID(s)), or maybe just force the garbage collector to run in the clone

git gc --aggressive --prune=now

which ought to do it since the fresh clone won't have any reflog history yet.
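The blob-ID check could look something like this. It is only a sketch on a toy repo (names, paths and the two fake versions of the secret are invented), and it clones over `file://` on purpose: a clone from a plain local path copies or hardlinks the whole object store, which is precisely the kind of leak you're checking for.

```shell
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's advisory pause
work=$(mktemp -d)
git init -q "$work/origin"
cd "$work/origin"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.invalid"

mkdir -p path/to
echo 'calculation v1' > path/to/secret-file
echo 'shared code' > app.txt
git add . && git commit -q -m "A"
echo 'calculation v2' > path/to/secret-file
git commit -q -am "B"

# Sanitized copy of the history, as described above
git checkout -q -b clean-master
git filter-branch -f --index-filter \
  'git rm --cached --ignore-unmatch -q -- path/to/secret-file' \
  -- clean-master >/dev/null 2>&1

# Every blob ID the secret path ever had on master
secret_blobs=$(git rev-list master -- path/to/secret-file |
  while read -r commit; do git rev-parse "$commit:path/to/secret-file"; done |
  sort -u)

git clone -q --single-branch -b clean-master "file://$work/origin" "$work/clean"

leaked=no
for blob in $secret_blobs; do
  if git -C "$work/clean" cat-file -e "$blob" 2>/dev/null; then
    leaked=yes
  fi
done
echo "secret blobs found in clone: $leaked"
```

If a secret file was ever deleted and re-added on master, the `rev-parse` loop would need to tolerate commits where the path doesn't exist; the toy history here doesn't have that case.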

Anyway, you can set this new repo up in your hosting software (which doesn't even have to put up with shallow clones anymore), and again you'll have to work out how to push/pull between the two.

With either solution, there's still a question about how to share changes between the "clean" branch and the regular branch(es) without accidentally leaking the secret file. Any attempt to merge between the two will have pitfalls, and in any case the history of the "clean" branch can never include such a merge. Using merges for this probably just isn't practical.

And when you see what I suggest instead, you might wonder at how this could be the "more practical" alternative... like I said, this is not an easy problem to solve.

So you probably have to share changes back and forth by rebasing - which is not something I would normally recommend, because it means you are constantly creating parallel commits that "make the same changes" in two different branches. This also means you'll have to use tags (or some similar mechanism) to keep track of which changes on each side have already been shared to the other side.

So no matter which method you used to create the "clean" branch, initially all changes are present in both branches. So tag both heads.

git checkout master
git tag shared-to-clean-master
git checkout clean-master
git tag shared-to-master

Now some work goes on in your dirty branch, and the contractor does some work that gets pushed to your clean branch, and eventually you have

... A -- B -- C -- D <--(master)
    ^shared-to-clean-master

... A' -- X -- Y -- Z <--(clean-master)
    ^shared-to-master

Now you rebase changes between the two.

git checkout clean-master
git checkout -b clean-to-dirty-temp
git rebase --onto master shared-to-master clean-to-dirty-temp
git checkout master
git checkout -b dirty-to-clean-temp
git rebase --onto clean-master shared-to-clean-master dirty-to-clean-temp

While you might think "I'll just script that", keep in mind that the rebase operations can generate conflicts. In particular, if the "dirty" side contains changes to the secret file, then you want those to conflict so that you can resolve the conflict by making sure the file doesn't pop into existence on the "clean" branch.
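Since a conflict can be resolved the wrong way, it may be worth adding a guard before you advance the clean branch. A minimal sketch, assuming the branch names used in this answer and a made-up history in which one rebased commit accidentally brings the file back:

```shell
set -eu
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/clean-master
git config user.name "Demo"
git config user.email "demo@example.invalid"

mkdir -p path/to
echo 'shared code' > app.txt
git add app.txt && git commit -q -m "Z: tip of clean-master"

# Simulate the rebased temp branch, with one badly resolved conflict
git checkout -q -b dirty-to-clean-temp
echo 'more shared code' >> app.txt
git commit -q -am "B': harmless change"
echo 'proprietary calculation' > path/to/secret-file
git add path/to/secret-file
git commit -q -m "C': oops, secret-file came back"

# Commits about to land on clean-master that touch the secret path
offending=$(git rev-list clean-master..dirty-to-clean-temp -- path/to/secret-file)
if [ -n "$offending" ]; then
  verdict="refuse to advance clean-master"
else
  verdict="safe to advance clean-master"
fi
echo "$verdict"
```

An empty result means no commit in the range touches the path, so nothing in that range can have introduced the file.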

So this yields

                     X' -- Y' -- Z' <--(clean-to-dirty-temp)
                    /
... A -- B -- C -- D <--(master)
    ^shared-to-clean-master

                      B' -- C' -- D' <--(dirty-to-clean-temp)
                     /
... A' -- X -- Y -- Z <--(clean-master)
    ^shared-to-master

Now you want to advance the master and clean-master refs, probably using either fast-forward or squash merges (depending whether you want to preserve the original commit granularity).

git checkout master

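# then either...
git merge clean-to-dirty-temp
# ... which should fast-forward, or
git merge --squash clean-to-dirty-temp
git commit    # --squash only stages the result; you commit it yourself

# and finally (-D, because a squashed branch doesn't count as merged)
git branch -D clean-to-dirty-temp
git tag -f shared-to-clean-master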

And of course on the other side

git checkout clean-master

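# then either...
git merge dirty-to-clean-temp
# ... which should fast-forward, or
git merge --squash dirty-to-clean-temp
git commit    # --squash only stages the result; you commit it yourself

# and finally (-D, because a squashed branch doesn't count as merged)
git branch -D dirty-to-clean-temp
git tag -f shared-to-master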

Let's suppose you fast-forwarded the dirty branch and squashed the clean branch. This would then leave you with

... A -- B -- C -- D -- X' -- Y' -- Z' <--(master)
                                    ^shared-to-clean-master

... A' -- X -- Y -- Z -- BCD <--(clean-master)
                           ^shared-to-master

and you should be able to verify that

git diff master clean-master

shows only that the secret file is missing from clean-master. Rinse and repeat.
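If you'd rather have a yes/no answer than eyeball the diff, something like the following would do. It's a sketch on a toy repo; here clean-master is made by simply deleting the file, which is fine for comparing tree contents but is not how you'd build the real clean branch.

```shell
set -eu
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.invalid"

mkdir -p path/to
echo 'proprietary calculation' > path/to/secret-file
echo 'shared code' > app.txt
git add . && git commit -q -m "dirty tree"

git checkout -q -b clean-master
git rm -q path/to/secret-file
git commit -q -m "clean tree"

# The only path allowed to differ between the two tips is the secret file
differing=$(git diff --name-only master clean-master)
if [ "$differing" = "path/to/secret-file" ]; then
  in_sync=yes
else
  in_sync=no
fi
echo "branches in sync apart from the secret: $in_sync"
```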

Oh, except if either side branches and merges, then the rebase gets more complicated. In fact if either side ever does an "evil merge", you get into some real trouble. What this probably means is, if either side does contain branches and merges, you'd need to simplify the history before rebasing changes back and forth. There are many ways to do this, but it's another layer of complexity. As one example

git checkout shared-to-master
git checkout -b clean-to-dirty-temp
git merge --squash clean-master
git commit    # --squash only stages the result; you commit it yourself
# and proceed with the rebase from here

And of course, to keep the branches from diverging enough that this is an even bigger mess than it's already guaranteed to be, you'll have to do it frequently.

Ok, so let's face it: the above solution sucks. What to do?

Well, you basically need to remove the secret file from the repo, so that you and the contractor can just share the same ongoing change history.

But you need that file, and you may need it to be version-coordinated.

So give the secret file its own repo, for your eyes only. Then create a third repo that contains no files of its own, but references both the shared repo and the "secret file" repo as submodules.
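Roughly, the layout would be set up like this. The repo names (`shared`, `secret`, `combined`) and the `make_repo` helper are invented for the demo, and the `protocol.file.allow` override is only there because the demo uses local paths; with real hosted URLs you shouldn't need it.

```shell
set -eu
work=$(mktemp -d)

make_repo() {   # hypothetical helper: create a repo with one committed file
  git init -q "$1"
  git -C "$1" symbolic-ref HEAD refs/heads/master
  git -C "$1" config user.name "Demo"
  git -C "$1" config user.email "demo@example.invalid"
  echo "$3" > "$1/$2"
  git -C "$1" add "$2"
  git -C "$1" commit -q -m "initial"
}

make_repo "$work/shared" app.txt 'shared code'                  # contractor can read this
make_repo "$work/secret" secret-file 'proprietary calculation'  # your eyes only

# Third repo: no files of its own, just pins a version of each of the others
git init -q "$work/combined"
cd "$work/combined"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.invalid"
git -c protocol.file.allow=always submodule --quiet add "$work/shared" shared
git -c protocol.file.allow=always submodule --quiet add "$work/secret" secret
git commit -q -m "Pin shared and secret repos"

pinned=$(git submodule status)
echo "$pinned"
```

Each commit in the combined repo records one commit ID from each submodule, which is what gives you the version coordination between the secret file and the rest of the code.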

MaSiMan · Answer 2 · score 0 · answered Jan 05 '18 at 13:01

A potential solution for your scenario might be to fork your repository before adding the secret file. The outsourcer could then work on the fork and you can pull his changes back into your main repo.
