0

Use case: I want to be able to develop in a private Git repository (let's call this private-repo), as some of the code in my project is sensitive and not for release to the public. I also want to be able to publish 'safe' branches, from which the sensitive code has been removed, to a public remote repository (public-repo).

To this end I have set up private (private-repo) and public (public-repo) repositories on GitHub. I have completed the removal of sensitive code in a branch (private-branch) in private-repo and pushed it to my-app-v.1.0 in the same repository (private-repo). I then delete private-branch.

I now push my-app-v.1.0 from private-repo to a branch my-app-v.1.0 in the remote public-repo.

Given that my-app-v.1.0 never contained the sensitive code in either private-repo or public-repo, would it be possible for someone with advanced knowledge of Git to recover versions of the project containing sensitive code if they only have access to public-repo?

And also, if the answer to the above is no, is it necessary for me to delete private-branch in private-repo?

Hopefully the diagram clarifies the state of the repositories at the end of these operations.

[Image: diagram of private-repo and public-repo after these operations]

Makoto
gbro3n

2 Answers

2

There is a possibility of sensitive data leaking in this approach. If you push that commit and it still has the history of your previous commits attached to it, then the sensitive code is still recoverable.
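As a rough illustration of that risk, the sketch below assumes the "clean" branch was produced by committing a deletion on top of commits that still contain the sensitive file. The temp directories standing in for private-repo and public-repo, the file name secrets.conf, and its content are all made up for the demo.

```shell
#!/bin/sh
# Throwaway repos in a temp dir stand in for private-repo and public-repo.
work=$(mktemp -d)
git init -q "$work/private-repo"
git init -q --bare "$work/public-repo.git"
cd "$work/private-repo"
git config user.name "Demo"
git config user.email "demo@example.com"
git symbolic-ref HEAD refs/heads/private-branch

# Commit 1 contains the sensitive file (hypothetical name and content).
echo "API_KEY=hunter2" > secrets.conf
echo "print('app')" > app.py
git add .
git commit -q -m "Initial commit with config"

# Commit 2 removes it; the release branch is created from this tip.
git rm -q secrets.conf
git commit -q -m "Remove sensitive config"
git branch my-app-v.1.0

# Push only the "clean" branch. Its ancestor commits are sent along with it.
git push -q "$work/public-repo.git" my-app-v.1.0

# Someone who can only see public-repo clones it and walks back one commit.
git clone -q --branch my-app-v.1.0 "$work/public-repo.git" "$work/outsider"
cd "$work/outsider"
recovered=$(git show HEAD~1:secrets.conf)
echo "Recovered from public history: $recovered"
```

Deleting private-branch afterwards would not change this in the sketch, since my-app-v.1.0 points at the same chain of commits.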

In this particular scenario, the simpler approach would be to not use Git in this fashion, but rather physically decouple the history of the two repositories. The best way to do this is to create a new repository that's meant to be public in a separate folder, copy over all of the code that is safe into it, then push that repository. From there, Git has no clue where the original history came from, and it'll be a lot cleaner as master would only ever contain "safe" code.
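A minimal sketch of that copy step, using `git archive` to export only the files at the tip of the private branch (one possible way to do the copy, not the only one). The directory names, file names, and commit messages are placeholders, and the private repo here is a stand-in built on the spot.

```shell
#!/bin/sh
work=$(mktemp -d)

# Stand-in private repo whose history contains a secret (hypothetical content).
git init -q "$work/private-repo"
cd "$work/private-repo"
git config user.name "Demo"
git config user.email "demo@example.com"
echo "API_KEY=hunter2" > secrets.conf
echo "print('app')" > app.py
git add .
git commit -q -m "Initial commit with config"
git rm -q secrets.conf
git commit -q -m "Remove sensitive config"

# Export only the files at the current tip; no .git data comes along.
mkdir "$work/public-repo"
git archive HEAD | tar -xf - -C "$work/public-repo"

# Start a brand-new history in the public folder.
cd "$work/public-repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
git add .
git commit -q -m "Public release v1.0"

# Sanity checks: a single commit, and no commit that ever touched the secret string.
commit_count=$(git rev-list --count HEAD)
secret_hits=$(git log --all --oneline -S"API_KEY")
echo "Commits in public history: $commit_count"

# From here one would add the real public remote and push, for example:
#   git remote add origin <URL of public-repo>
#   git push -u origin HEAD
```

Checking the exported files by eye before the first commit is still worthwhile, since `git archive` copies whatever is tracked at that tip.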

Makoto
  • Could I then set the public repo as a remote for the private local branch and continue to push updates from it? Would Git accept the changes (and presumably still not know the history of the code), or would it error because the branches don't share the same ancestry? I guess I could just copy manually, but that's extra work. – gbro3n Oct 29 '15 at 17:04
  • If I understand you correctly: you want to use the public repo to push updates to the private one, or vice versa? – Makoto Oct 29 '15 at 17:05
  • vice versa - private to public after a manual copy to the public repository. – gbro3n Oct 29 '15 at 17:10
  • That has a higher risk to it...the whole idea of this was to physically decouple the two histories and ensure that nothing in private would go into public. – Makoto Oct 29 '15 at 17:16
1

It depends on what you mean by "removed". If you deleted whatever content you wanted to hide and committed that change, then the change is in the history and could be located by someone looking at the history and your diffs from previous revisions.

GitHub has a help article on what to do when your repo or its history contains sensitive information:

https://help.github.com/articles/remove-sensitive-data/

It's not very simple, and it involves all committers being vigilant about how they treat revisions, rebases...

It is probably more straightforward if you keep your sensitive data in separate files and never commit them at all, or keep them in another repo that you don't ever publish and adapt your code to get the sensitive data from there.
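One way the "separate files, never committed" idea might look in practice is sketched below. The file names config.local and config.example are hypothetical; the point is only that the real values live in an ignored file while a placeholder template is tracked.

```shell
#!/bin/sh
work=$(mktemp -d)
git init -q "$work/app"
cd "$work/app"
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical layout: real values in config.local, a safe template in config.example.
echo "API_KEY=hunter2" > config.local
echo "API_KEY=changeme" > config.example
echo "config.local" > .gitignore

# "git add ." skips anything matched by .gitignore.
git add .
git commit -q -m "Add app with config template"

tracked=$(git ls-files)
ignored=$(git check-ignore config.local)
echo "Tracked files:"
echo "$tracked"
echo "Ignored: $ignored"
```

This only helps if the ignore rule is in place before the sensitive file is first added; a file that was already committed stays in history.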

Dan Lowe
  • I may need another option then. Though it would have been useful to keep the full history, the sensitive code (config) is not essential. Maybe I can just create a new repo or use a hard reset as described here? http://stackoverflow.com/questions/1338728/delete-commits-from-a-branch-in-git – gbro3n Oct 29 '15 at 16:59