I have an application that has a lot of UI, DB, ... and also a background processing part. I want to keep the background processing part hidden from my contractors so they can't disclose the clever algorithm I came up with.

What's the right way to do that with git?

I started out with submodules: I have the main repo with all the UI, DB, ... in there, and a secret repo which I included via a git submodule. After I change something in the processing part of the app, I minify it (it's JS) and save the result to the main repo, so it can be used by the rest of the application. This way the contractor can pull my minified file as part of the main repo and run the app without needing the secret code. So far so good.

However, the problems are:

  1. The contractor gets errors when checking out the main project because the submodule is not accessible for him
  2. The contractor has accidentally created a "remove processing submodule" commit, because the submodule folder is not there, and if he does git add . and then commits, git assumes that he wanted to delete the submodule.

So all in all it seems that's not the right strategy. I read that git subtree can be used, but I didn't find any example on how to use it for that use case.

Any help is highly appreciated!

stoefln
  • Git, being from the GNU-ish "source wants to be free" tradition, makes it quite hard to do this. Don't try to do it *in* Git; if you need to do it, do it "around" or "outside" Git. – torek Nov 10 '21 at 18:22
  • Wrap your secret sauce behind a library. Distribute the library as a compiled binary, and distribute the headers needed to use it, but don't distribute the source for the library. – Alecto Irene Perez Nov 10 '21 at 19:29
  • This would come with a lot of refactoring effort. I have dependencies which go both ways. I don't think that's an option... – stoefln Nov 22 '21 at 22:20
  • Is it not sufficient to make the subcontractors sign some form of NDA? – 0x5453 Nov 22 '21 at 22:25
  • It doesn't matter what you want to do or what you think you can't do. You can't hide part of a git repository. Git doesn't support this workflow. You need to split the code into multiple git repositories or share the code/binaries in a different manner. – Lasse V. Karlsen Nov 23 '21 at 08:17

3 Answers

  1. The contractor gets errors when checking out the main project because the submodule is not accessible for him

That's not really an error. Submodules are separate histories, so if you don't need one of the pieces, just don't init/update it, and the contractor doesn't need this one. --init --recursive is a common set of options for git submodule update, but there's a reason they're options, and this is it.
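A minimal sketch of what this looks like from the contractor's side, using throwaway local repositories. The names secret, main, processing and contractor are hypothetical, and protocol.file.allow=always is only there because the demo adds the submodule from a local path; it is not part of the technique.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)   # scratch area for the demo
cd "$work"

# Helper: commit in repo $1 with message $2, using a throwaway identity.
commit() {
    git -C "$1" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$2"
}

# Stand-in for the private repository.
git init -q secret
commit secret "clever algorithm"

# Main repository that references it as a submodule.
git init -q main
git -C main -c protocol.file.allow=always \
    submodule --quiet add "$work/secret" processing
commit main "add processing submodule"

# The contractor clones WITHOUT --recurse-submodules and never runs
# "git submodule update --init" for the private path.
git clone -q main contractor

# A leading "-" in the status line marks a submodule that is not initialized.
status=$(git -C contractor submodule status)
echo "$status"

# Other submodules could still be initialized one by one, for example:
#   git submodule update --init path/to/other
```

The clone succeeds without ever contacting the private repository; only an explicit init/update of that path would try to fetch it.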

  2. The contractor has accidentally created a "remove processing submodule" commit, because the submodule folder is not there, and if he does git add . and then commits, git assumes that he wanted to delete the submodule.

Yes, this is a downside of Git changing git add in the 2.0 transition so that it also implicitly stages removals by default. There are a few ways to deal with it:

  • The contractor can enable sparse checkout and include everything except that submodule: git sparse-checkout set '*' '!that/submodule' (on Git versions where cone mode is the default, this pattern form likely needs --no-cone).
  • The contractor can do it manually as a one-off with git update-index --skip-worktree that/submodule.
  • You could do one of your mergebacks with --no-commit, and I think git checkout --ours that/submodule would restore the index entry before you commit the merge. From then on Git will see no change on the contractor's history, so it won't conflict with anything you do.
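The skip-worktree variant can be tried out in a scratch repository. This is only a sketch: the submodule entry is faked with a placeholder commit id so that no second repository is needed, and the names app.js and processing are hypothetical.

```shell
#!/bin/sh
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .

# One ordinary file plus a fake submodule entry (a "gitlink", mode 160000).
# The commit id is an arbitrary placeholder, not a real commit.
echo 'console.log("ui");' > app.js
git add app.js
git update-index --add \
    --cacheinfo 160000,0123456789abcdef0123456789abcdef01234567,processing
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "main repo with processing submodule"

# The submodule folder does not exist, so Git reports it as deleted and
# "git add ." would stage that deletion.
before=$(git status --porcelain)
echo "before: $before"

# Tell Git to leave this index entry alone regardless of the work tree.
git update-index --skip-worktree processing

# A normal contractor change, staged the usual way.
echo '// contractor change' >> app.js
git add .

after=$(git status --porcelain)
echo "after:  $after"

# "S" in the first column marks a skip-worktree entry.
flag=$(git ls-files -v processing)
echo "$flag"
```

The flag is local to the clone where it was set, so each contractor has to set it once after cloning.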


Another way to manage this is to keep public and private histories, as in this answer: you --amend the public commit so that it does not include the content you want concealed, and do all future updates from private to public with similarly edited squash merges, each followed by a -s ours mergeback as shown there.

jthill

You can try it the other way around:

  • have the public part in a self-contained repo,
  • include that public part as a submodule in a bigger, private repo that has the complete source code.

When you build the source code ("building" is minifying, if I understand correctly), you just have to place the build output in the submodule.
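A rough sketch of that layout with throwaway local repositories. The names public, private, app, secret/algo.js and processing.min.js are hypothetical, cp stands in for the real minifier, and protocol.file.allow=always is only needed because the demo adds the submodule from a local path.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
cd "$work"

# Helper: commit staged changes in repo $1 with message $2.
commit() {
    git -C "$1" -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "$2"
}

# Public repo: everything the contractors are allowed to see.
git init -q public
echo '// UI and DB code' > public/ui.js
git -C public add ui.js
commit public "UI"

# Private repo: the secret sources, with the public repo mounted at app/.
git init -q private
mkdir private/secret
echo 'function cleverAlgorithm() {}' > private/secret/algo.js
git -C private -c protocol.file.allow=always \
    submodule --quiet add "$work/public" app

# "Build": place the output inside the submodule and commit it there.
cp private/secret/algo.js private/app/processing.min.js
git -C private/app add processing.min.js
commit private/app "Update minified processing bundle"

# Record the secret sources and the new submodule pointer in the private repo.
git -C private add secret app
commit private "Private sources plus pointer to public repo"

entry=$(git -C private ls-tree HEAD app)
public_files=$(git -C private/app ls-files)
echo "$entry"
echo "$public_files"
```

Contractors only ever clone the public repository, so they have no submodule entry to trip over; the commit made inside app/ still has to be pushed to the public remote from there.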

LeGEC

An elegant solution could be to encrypt the non-public files in your repository and use Git "clean/smudge" filters to make the encryption/decryption procedure fully transparent:

  • you will have filters and keys in place, so you can work in the repository as usual
  • your subcontractor won't have the keys (and thus disables the filters), so they will only see the encrypted content.

The approach is simple and straightforward. See this example from the gitattributes manual page:

[filter "crypt"]
    clean = openssl enc ...
    smudge = openssl enc -d ...
    required

Caveat: checkout and commit operations will incur a slight performance hit due to the encryption/decryption process. Also, the contractor will still have full access to commit logs -- so you need to check if this meets your requirements.
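One way to try the filter idea end to end is sketched below. The concrete openssl options are assumptions and not taken from the manual excerpt above: -nosalt is chosen so that the same input always yields the same ciphertext (otherwise every filtered file would look modified), and -pbkdf2 assumes OpenSSL 1.1.1 or newer. The key file location and the secret/ path are hypothetical.

```shell
#!/bin/sh
set -eu
repo=$(mktemp -d)
keydir=$(mktemp -d)           # the key lives outside the repository
key="$keydir/key"
printf 'demo-passphrase\n' > "$key"

cd "$repo"
git init -q .

# Filter definition: stays in the local config of people who hold the key.
git config filter.crypt.clean \
    "openssl enc -aes-256-cbc -pbkdf2 -nosalt -pass file:$key"
git config filter.crypt.smudge \
    "openssl enc -d -aes-256-cbc -pbkdf2 -nosalt -pass file:$key"
git config filter.crypt.required true

# Attribute mapping: committed, so everyone sees which paths are filtered.
echo 'secret/** filter=crypt' > .gitattributes

mkdir secret
echo 'function cleverAlgorithm() {}' > secret/algo.js
git add .
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "add encrypted processing code"

# The committed blob is ciphertext, so the plaintext should not appear in it.
if git cat-file -p HEAD:secret/algo.js | grep -a -q cleverAlgorithm; then
    leaked=yes
else
    leaked=no
fi

# Checking the file out again runs the smudge filter and restores plaintext.
rm secret/algo.js
git checkout -- secret/algo.js
restored=$(cat secret/algo.js)
echo "leaked=$leaked restored=$restored"
```

Deterministic encryption like this reveals when two versions of a file are identical, which may or may not be acceptable for your case.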

Alex O