In my scenario, I have a program that analyzes data input files and produces other data output files. I want to version control the program, and I want to version control the data files, and as a matter of preference, I want to have the working copy of the data files within the working copy of the program. I want the program and data to be version controlled separately to reduce "noise". The program does not have a dependency on the data files.

If I use git submodules, then whenever something changes in the data directory (new commits, I think), the program's repository reports that the submodule has been updated. That would be useful if the program depended on the data, but it doesn't.

In such a scenario, is it possible to have a working copy within another working copy without using git submodules?

Andrew Grimm

2 Answers

It's possible to simply "nest" working copies in Git. So if you clone your program repository, then inside that make a clone of your data files, then you can work with them independently. When Git performs file operations, it searches up the directory tree looking for a .git directory, so Git operations performed in the data repository won't affect the program repository. If you do this, you may want to add the name of the data directory to .gitignore to reduce noise from the program repository.
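
As a rough illustration of the nesting described above, here is a minimal sketch. The directory names `program` and `data`, the file names, and the demo identity are all hypothetical, and both repositories are created locally with `git init` so the script runs without a remote; with real remotes, the inner step would be a `git clone` of the data repository instead.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox so the sketch touches nothing real.
sandbox=$(mktemp -d)
cd "$sandbox"

# Outer working copy: the program repository (hypothetical name).
git init -q program
cd program
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'print("analyzing")' > analyze.py
# Ignore the nested working copy so the program repository never reports it.
echo 'data/' > .gitignore
git add analyze.py .gitignore
git commit -q -m "Add program"

# Inner working copy: the data repository, with its own .git directory.
# With a real remote this step would be `git clone <data-url> data` instead.
git init -q data
git -C data config user.name "Demo User"
git -C data config user.email "demo@example.com"
echo '1,2,3' > data/input.csv
git -C data add input.csv
git -C data commit -q -m "Add data"

# Each repository keeps its own history; the outer one is unaware of data/.
program_status=$(git status --porcelain)
program_commits=$(git rev-list --count HEAD)
data_commits=$(git -C data rev-list --count HEAD)
echo "program status: [$program_status]"
echo "program commits: $program_commits, data commits: $data_commits"
```

Because `data/` is listed in `.gitignore`, committing inside `data` leaves the program repository's status clean, and each repository counts only its own commits.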

Greg Hewgill
  • Love to hear this. Is there anything that could go wrong with this approach? Or it just works? – Chris Calo Jan 13 '12 at 14:14
  • @ChristopherJamesCalo: I don't know of anything that can specifically go wrong. The only thing I can think of to watch for is if you run `git clean -f -x` in the program directory, then it would delete the entire data directory even if it appears in `.gitignore`. – Greg Hewgill Jan 13 '12 at 18:40
  • Sounds great. Been using this a lot today, and it seems to work perfectly. Thanks! – Chris Calo Jan 14 '12 at 05:22
  • The single disadvantage of using this approach over e.g. submodules is that many CI tools only support one repository (optionally with submodules) so integration with these tools will be somewhat more complicated. – ubuntudroid Jun 04 '14 at 06:42
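
Regarding the `git clean` caution in the comments above, a dry run is a cheap way to see what a given Git version would delete before anything is removed. This is a minimal sketch, assuming the same hypothetical layout (a `program` repository with an ignored, nested `data` repository); the file names are made up for illustration.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox with the hypothetical nested layout.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q program
cd program
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'data/' > .gitignore
git add .gitignore
git commit -q -m "Ignore nested data working copy"

# Nested data repository plus one ordinary untracked file in the outer one.
git init -q data
echo '1,2,3' > data/input.csv
echo 'temporary' > scratch.log

# -n (dry run) only reports what would happen; nothing is deleted.
# -x includes ignored paths, -d includes untracked directories.
clean_preview=$(git clean -n -x -d)
echo "$clean_preview"
```

Reading the preview before running the real command shows whether `data/` is on the deletion list. Current `git clean` documentation says untracked nested repositories are left alone unless `-f` is given twice, but older versions may behave differently, so the dry run is worth the extra step.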

I think this is also a useful workaround when working with Heroku and Rails in situations where you have plugins that need to be versioned. Currently, Heroku does not support git submodules, so nesting working copies seems like the best solution. In this case, you wouldn't want to add the plugin directory to .gitignore, because then your plugins would not be uploaded when you push to Heroku, but c'est la vie. Hopefully, Heroku will eventually support git submodules.

metasoarous