How to version control files which shouldn't be pushed to a remote repository?

Question

I'm collaborating with my team on a Rails API repository, and we have a .env file which contains our environment variables. This file is listed in the .gitignore file, since the environment variables include sensitive tokens which we don't want to check into version control or push to GitHub.

This creates a problem when I run into bugs related to those environment variables. Everybody on the team seems to have a slightly different (or sometimes substantially different) set of environment variables that they use for their local setup (long story, I know this is the root cause and needs to change but as a new consultant I have limited ability to change this for now).

A short-term workaround I thought of would be to have a local git repository with no remote, just for the .env file (and any similar files which shouldn't be pushed to remote master). This would at least give me the ability to roll back, check diffs to see exactly what changed, and read commit messages for reminders about why changes were made. But since there's already a git repository tracking changes to the majority of the codebase, I'm unsure if creating this 2nd repo is possible.

I read this post, but it seems to run into the same problem when it comes to the .gitignore file.

I also considered using git submodules, but are those only for subdirectories or would they also work if the two .git/ folders are siblings in the same directory?

torek · Answer 1 · 2019-08-09T17:07:41.163

Uncontroversial (I think) and incontrovertible (I think):

If you need to version-control some set of files, you must put them in a version control system.
If the VCS is distributed (as Git is) and will send all the files somewhere else (as Git will—they're in commits and only whole commits can be shared), that's not suitable to your use case.
But you'd like to use Git for the rest of the files.

Conclusion: you'll need at least two VCSes and/or repositories.

The second VCS can also be Git, as long as you make sure not to distribute (at least in a public fashion) the second repository.

Submodules can work. A submodule is nothing more than a reference to some other Git repository: in essence, a URL and one specific commit hash ID. However, each Git repository controls its own work-tree. If the files must be intermingled ("live in the same folder"), and you'd like to use submodules anyway, you have a couple of options:

deal with it (see below);
don't use the Git work-trees as your actual work area;
if your system supports symbolic links, and there aren't too many files involved, have one repository contain symlinks to the files that are actually in the other repository's work-tree; or
use nothing but symbolic links, so that your apparently-unified work-tree is just a tree of symlinks to files in the desired sub-directory.

Note that if you do choose to use submodules, you may want to structure this as a single superproject that does nothing but hold the two submodules that hold the commits of interest (and maybe the forest of symlinks, if you use the last approach here). That makes the two submodules entirely independent of each other, since a submodule has no awareness of its superproject. Only the third repository—the superproject—will know that there are two other Git repositories involved, and it's the superproject that arranges the submodules to appear in positions in which the relative symbolic links (if you use those) work.

If that's overkill, just pick one of the two independent Git repositories to be the superproject. The other Git repository will get cloned as a subdirectory of the superproject.
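The superproject layout described above might look roughly like this. It is a sketch under stated assumptions, not a recommended setup: the two throwaway repositories under a temporary directory stand in for A and B, the identity variables and README files are made up, and `protocol.file.allow=always` is only passed because the demo clones from local paths (recent Git versions restrict file-based submodule clones by default).

```shell
#!/bin/sh
# Sketch: a superproject that does nothing but pin two submodules.
set -eu

demo=$(mktemp -d)

# Identity for the demo commits only; omit if user.name/user.email are configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Two stand-in repositories playing the roles of A (public) and B (private).
for name in A B; do
    git init --quiet "$demo/$name"
    printf '%s\n' "$name" > "$demo/$name/README"
    git -C "$demo/$name" add README
    git -C "$demo/$name" commit --quiet -m "Initial commit in $name"
done

# The superproject records, for each submodule, a URL and one commit hash ID.
git init --quiet "$demo/super"
cd "$demo/super"
git -c protocol.file.allow=always submodule --quiet add "$demo/A" files
git -c protocol.file.allow=always submodule --quiet add "$demo/B" sensitive-files
git commit --quiet -m 'Pin A and B as submodules'
```

With real remotes, the two local paths would be replaced by url-of-A and url-of-B, and the `-c protocol.file.allow=always` part would presumably not be needed.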

About "dealing with it"

Let's say that repository A has commits containing only public files, and repository B has commits containing only private files. When you clone both A and B, you get two different work-trees. At most one of them can be path/to/files, but you want files from both A and B to appear in path/to/files. If A's work-tree is named path/to/files:

cd path/to; git clone url-of-A files

then B's work-tree is definitely not path/to/files, and the files checked out from B won't initially appear in path/to/files. Let's say, for instance, that B isn't a submodule at all and you now—while still in path/to—run:

git clone url-of-B sensitive-files

You now have files/* (with Git-repo cloned into files/.git) holding files extracted from repo A, with sensitive-files/* holding files extracted from repo B. You can now manually, or with a script, copy or symlink the files that are managed in sensitive-files/* so that they can be accessed via names that have the form files/*.
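The copy-or-symlink step could be scripted along these lines. This is only a sketch: the temporary directory stands in for path/to, the .env content is made up, and appending each link to files/.gitignore is just one possible way to keep the links out of A's commits.

```shell
#!/bin/sh
# Sketch: link every file from B's work-tree (sensitive-files/) into
# A's work-tree (files/) using relative symlinks.
set -eu

base=$(mktemp -d)                      # stands in for path/to
mkdir -p "$base/files" "$base/sensitive-files"
printf 'API_TOKEN=example\n' > "$base/sensitive-files/.env"   # hypothetical content

cd "$base/files"
for src in ../sensitive-files/* ../sensitive-files/.[!.]*; do
    [ -e "$src" ] || continue          # skip patterns that matched nothing
    name=$(basename "$src")
    [ "$name" = ".git" ] && continue   # never link B's repository directory
    ln -sf "$src" "$name"              # relative link into the sibling directory
    # Keep the link out of A's commits by listing it in files/.gitignore.
    grep -qxF "/$name" .gitignore 2>/dev/null || printf '/%s\n' "$name" >> .gitignore
done
```

Because the links are relative, they keep working if path/to is moved as a whole, as long as files and sensitive-files stay siblings.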

The copies or symlinks that appear in files/* can be listed in files/.gitignore so that they do not get added to repo A's commits. Or, if you are using symlinks, note that what's in files/sensitive-file-1 (in A's work-tree) is actually just the path name ../sensitive-files/sensitive-file-1. It's probably OK to commit the symlink to repo A, even though repo A is accessible from anywhere, because whoever gets a copy of it knows only that the program(s) in A might read and/or write to a file named sensitive-file-1. The actual contents of that file live not in path/to/files but rather in path/to/sensitive-files, so they never appear in repo A at all.

Hi @torek, can you clarify what you mean by `if the files must be intermingled`? I do need the files to exist in the same folder, but I expect the git repositories will be mutually exclusive, and each file should belong to either one repository or the other (never both). — Richie Thomas, Aug 09 '19 at 16:43
By "intermingled" I meant "files from repository A are to be found in `path/to/files`, and files from repository B are to be found in `path/to/files` as well". That means that either `path/to/files` *must not* be the work-tree of *either* A or B, *or*, you have to copy the files from one of the two work-trees (A or B) into the other work-tree (second of the four bullet points—probably a matter of manual copying, or a script that does it) or use the third or fourth bullet point approach. — torek, Aug 09 '19 at 16:57

score 1 · Accepted Answer · answered Aug 09 '19 at 17:57

Git doesn't require that a repository or its work tree be in any particular place. You can point it at the specific work tree and/or repository you're using and it'll use those rather than finding them.

So leave your main repository untouched, and maintain a sideband repo for your .env file.

git init --bare ~/orts.git
git config --global alias.orts '!git --work-tree "$PWD" --git-dir ~/orts.git'

and then you can do e.g.

git orts add .env
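From there the usual commands (commit, diff, log, checkout) should work against the sideband repository. The following is a sketch in a throwaway directory, with several assumptions: a shell function named `orts` stands in for the `git orts` alias so that the demo does not touch the global Git configuration, the repository is created bare so that `--git-dir` points directly at the repository data, and all paths, token values, and commit messages are made up.

```shell
#!/bin/sh
# Sketch of the sideband-repository workflow for a single .env file.
set -eu

demo=$(mktemp -d)
orts_dir="$demo/orts.git"              # plays the role of ~/orts.git
project="$demo/project"                # plays the role of the Rails checkout

# Identity for the demo commits only; omit if user.name/user.email are configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init --quiet --bare "$orts_dir"
mkdir -p "$project"
cd "$project"

# Same idea as the alias: explicit repository, current directory as work-tree.
orts() { git --git-dir "$orts_dir" --work-tree "$PWD" "$@"; }

printf 'API_TOKEN=first\n' > .env
orts add .env
orts commit --quiet -m 'Track .env as first received'

printf 'API_TOKEN=second\n' > .env
orts diff -- .env                      # shows exactly what changed
orts commit --quiet -am 'Switch to the second token'

orts log --oneline -- .env             # commit messages as reminders
orts checkout HEAD~1 -- .env           # roll the file back one commit
```

The main repository never sees any of this, since .env stays in its .gitignore and the sideband repository lives outside the project directory.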

jthill

I ended up doing a less-sophisticated version of this. The repo whose `.env` I wanted to version was in the same parent directory as several other projects for the same client. I actually wanted to version all their `.env` files together, so I ran `git init` in that parent directory and white-listed just the `.env` files in that parent's .gitignore file. BTW, out of curiosity- what does "orts" refer to here? Are you using it to mean "a scrap or remainder of food from a meal", i.e. similar to "crumbs"? If so, I look forward to showing people the "git crumbs" command on my machine lol. – Richie Thomas Aug 09 '19 at 20:50
That's pretty much it, leftover bits from the main event. I wouldn't call the setup you came up with less sophisticated, looks to me like you're getting a good grip on how this works. If you're going to share a single repo across multiple worktrees like this I'd look into making it bare and managing the indexes explicitly too, the `git` command doesn't take an index file argument, the override for that is environment-only, I'd venture that's just because nobody cared enough yet to offer a patch for it. – jthill Aug 09 '19 at 21:05
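The environment override in question is `GIT_INDEX_FILE`. A minimal sketch of one bare repository shared by two work-trees, each with its own index file, might look like this; the directory names, the index file names, and the helper `orts_in` are all hypothetical.

```shell
#!/bin/sh
# Sketch: one bare repository, two work-trees, one index file per work-tree.
set -eu

demo=$(mktemp -d)
repo="$demo/orts.git"
git init --quiet --bare "$repo"

mkdir -p "$demo/app-one" "$demo/app-two"
printf 'API_TOKEN=one\n' > "$demo/app-one/.env"
printf 'API_TOKEN=two\n' > "$demo/app-two/.env"

# Run Git against the shared repository with a project-specific index.
# Usage: orts_in <project-dir> <git arguments...>
orts_in() {
    dir=$1
    shift
    (
        cd "$dir"
        GIT_INDEX_FILE="$repo/index-$(basename "$dir")" \
            git --git-dir "$repo" --work-tree "$dir" "$@"
    )
}

orts_in "$demo/app-one" add .env
orts_in "$demo/app-two" add .env

# Each index holds its own staged copy of .env.
orts_in "$demo/app-one" show :.env
orts_in "$demo/app-two" show :.env
```

This only demonstrates staging. Commits made through either index would still go through the repository's single HEAD, so a real setup would probably also want a branch per project, which is not shown here.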

score 0 · Answer 3 · answered Aug 07 '19 at 19:27

we have a .env file which ... we don't want to check into version control

So ... you have a file that you want to keep synchronized, but you don't want to use version control for it. So, use something else? A shared filesystem, or a tiny website. Heck, if there's no router in the way, trivial FTP will do the trick.

You could encrypt the file.

environment variables include sensitive tokens

That's the part I would re-think, because it's weird. Truly sensitive data is never in the environment.

James K. Lowden

I actually do want to use version control to manage this file, I just don't want to push the git branch to a remote server. As for storing sensitive data in environment variables, I was under the impression that this was common practice. This is how tokens are stored on Heroku, among other platforms. See this post- https://stackoverflow.com/a/11300680/2143275 – Richie Thomas Aug 07 '19 at 19:46
To clarify- I want to use a 2nd git repository locally, to make it easier for me personally to manage the .env file. Meanwhile, the larger team will continue to manage the rest of the project using the original git repository. Hope that makes sense. – Richie Thomas Aug 07 '19 at 20:05
