Uncontroversial (I think) and incontrovertible (I think):
- If you need to version-control some set of files, you must put them in a version control system.
- If the VCS is distributed (as Git is) and will send all the files somewhere else (as Git will: they're in commits, and only whole commits can be shared), that's not suitable for your use case.
- But you'd like to use Git for the rest of the files.
Conclusion: you'll need at least two VCSes and/or repositories.
The second VCS can also be Git, as long as you make sure not to distribute (at least in a public fashion) the second repository.
Submodules can work. A submodule is nothing more than a reference to some other Git repository: in essence, a URL and one specific commit hash ID. However, each Git repository controls its own work-tree. If the files must be intermingled ("live in the same folder"), and you'd like to use submodules anyway, you have several options:
- deal with it (see below);
- don't use the Git work-trees as your actual work area;
- if your system supports symbolic links, and there aren't too many files involved, have one repository contain symlinks to the files that are actually in the other repository's work-tree; or
- use nothing but symbolic links, so that your apparently-unified work-tree is just a tree of symlinks to files in the desired sub-directory.
Note that if you do choose to use submodules, you may want to structure this as a single superproject that does nothing but hold the two submodules that hold the commits of interest (and maybe the forest of symlinks, if you use the last approach here). That makes the two submodules entirely independent of each other, since a submodule has no awareness of its superproject. Only the third repository, the superproject, will know that there are two other Git repositories involved, and it's the superproject that places the submodules at paths where the relative symbolic links (if you use those) work.
If that's overkill, just pick one of the two independent Git repositories to be the superproject. The other Git repository will get cloned as a subdirectory of the superproject.
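The three-repository layout can be sketched as a small shell function. This is only an illustration under assumptions: the name make_superproject is made up for this example, the two URLs are whatever your real repositories use, and files and sensitive-files are simply the directory names used in the rest of this answer.

```shell
# Hypothetical helper: create a superproject that holds nothing but
# two submodules. All three arguments are supplied by the caller.
make_superproject() {
    super_dir=$1   # directory for the new superproject
    url_a=$2       # URL of the public repository (A)
    url_b=$3       # URL of the private repository (B)

    git init "$super_dir" || return 1
    (
        cd "$super_dir" || exit 1
        # Each submodule is recorded as a URL plus one specific commit.
        git submodule add "$url_a" files || exit 1
        git submodule add "$url_b" sensitive-files || exit 1
        git commit -m "Add public and private submodules"
    )
}
```

Calling it as make_superproject path/to url-of-A url-of-B would leave the two work-trees side by side in path/to/files and path/to/sensitive-files. Only the superproject records both URLs, so it should be treated as carefully as the private repository.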
About "dealing with it"
Let's say that repository A has commits containing only public files, and repository B has commits containing only private files. When you clone both A and B, you get two different work-trees. At most one of them can be path/to/files, but you want files from both A and B to appear in path/to/files. If A's work-tree is named path/to/files:
cd path/to; git clone url-of-A files
then B's work-tree is definitely not path/to/files, and the files checked out from B won't initially appear in path/to/files. Let's say, for instance, that B isn't a submodule at all and you now, while still in path/to, run:
git clone url-of-B sensitive-files
You now have files/* (with the repository itself in files/.git) holding files extracted from repo A, and sensitive-files/* holding files extracted from repo B. You can now, manually or with a script, copy or symlink the files managed in sensitive-files/* so that they can be accessed via names of the form files/*.
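The linking step can be scripted. The following is a minimal sketch, not a definitive tool: the function name link_sensitive is invented here, and it assumes that it is given the directory containing both files and sensitive-files (path/to in the example above), that the private files sit at the top level of sensitive-files, and that the system supports symbolic links.

```shell
# Hypothetical helper: make each top-level file in sensitive-files/
# reachable under files/ through a relative symlink.
link_sensitive() {
    base=$1   # directory containing both files/ and sensitive-files/
    # The * glob skips dot-names, so sensitive-files/.git is never visited.
    for src in "$base"/sensitive-files/*; do
        [ -f "$src" ] || continue   # skip subdirectories and an unmatched glob
        name=${src##*/}
        dest=$base/files/$name
        # Never overwrite something that already exists in A's work-tree.
        if [ -e "$dest" ] || [ -L "$dest" ]; then
            continue
        fi
        # The target is relative to files/, so the two work-trees can be
        # moved together without breaking the links.
        ln -s "../sensitive-files/$name" "$dest"
    done
}
```

Running link_sensitive path/to after both clones would then populate files/ with the links. Copying instead of linking is a one-line change (cp in place of ln -s), at the cost of having to copy changes back into sensitive-files before committing them to repo B.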
The copies or symlinks that appear in files/* can be listed in files/.gitignore so that they do not get added to repo A's commits. Or, if you are using symlinks, note that what's in files/sensitive-file-1 is actually just the path name ../sensitive-files/sensitive-file-1. It's probably OK to commit the symlink to repo A, even though repo A is accessible from anywhere, because whoever gets a copy of it learns only that the program(s) in A might read and/or write a file named sensitive-file-1. The actual contents of that file live not in path/to/files but rather in path/to/sensitive-files, so they never appear in repo A at all.
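If you take the .gitignore route, the list can be generated rather than maintained by hand. A rough sketch, assuming the side-by-side layout described above and file names without glob characters (which .gitignore would interpret as patterns); the function name ignore_links is hypothetical:

```shell
# Hypothetical helper: add every symlink found at the top level of
# files/ to files/.gitignore, skipping names that are already listed.
ignore_links() {
    base=$1   # directory containing files/
    ignore=$base/files/.gitignore
    touch "$ignore"
    for link in "$base"/files/*; do
        [ -L "$link" ] || continue
        name=${link##*/}
        # A leading slash anchors the pattern to the directory holding .gitignore.
        # grep -x matches whole lines; -F treats the name as a fixed string.
        grep -qxF "/$name" "$ignore" || printf '/%s\n' "$name" >> "$ignore"
    done
}
```

The function is safe to run repeatedly, since existing entries are not duplicated. If you commit the links instead, readlink files/sensitive-file-1 shows the target path, which is all that Git records for a symlink.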