Version Control on a Per-File Basis

Question

I regularly run computational fluid dynamics (CFD) simulations for large-ish projects and am looking for a better way to manage different file versions. I have some experience with Git and SVN (in only one kind of workflow), though I am fairly convinced they don't satisfy all my needs here.

A typical workflow is this:

1. Set up the initial file for a calibration simulation.
2. Tweak this file and rerun (repeat as needed).
3. When happy with the calibration, set up the validation run from the best set of parameters.
4. When calibration and validation are complete, set up files for the design runs, i.e. Scenario A, Scenario B, Scenario C.
5. Whoops, you forgot to output something critical in Scenario C, so you have to make a slight change and then re-run the simulation.

Current practice is to rename each successive run 001, 002, etc. This makes things unclear when the 'best' calibration might not have been the highest run number, i.e. you tested a few more things, but they didn't necessarily improve things. It is also confusing when you need to compare the results of the latest scenario simulations, and Scenarios A and B are both 001 but Scenario C is 002.

Ideally a version control system would be able to log each successive commit and show comparisons to older versions. A bonus would be being able to track the 'Best' version in the branch, i.e. the best calibration simulation so far, and to quickly revert to that version.

Secondly the system needs to allow creation of a new file from the template of a given version of an existing file. I.e., when creating the files for the scenario simulations A, B and C, start with a copy of the latest calibration/validation setup, then make the changes that define the different scenarios (different time of simulation, different geometry, etc.) and commit those scenarios. Then be able to do a quick compare between scenarios and see which versions they all have in common.

Thirdly, a last feature that may break the possible uses of many systems is to be able to have several versions of a file checked out in a folder at any one time. E.g., before the weekend I might set up 10 simulations with slightly different parameters and run them all in a batch, so that on Monday morning I can pick the best and then collapse the tree to that one.

Ideally this would have a nice GUI (think GitHub for Windows) that can show the recent changes to any file, commit one file's changes individually, and, most importantly, revert a single file to a previous version individually. I basically want to have only the desired calibration, validation, and Scenario A, B and C files in the folder at any time, but with the option to check out several versions of one and collapse these later.

A look around online suggests such a niche system doesn't exist, and I am not averse to creating something myself. But does anyone have a good workflow for an existing system (Git?) that may cover all or at least most of my requirements? Or perhaps any tools online that I can start with and hack together in some form? Or, best of all, a wonder system that covers the lot...

score 1 · Answer 1 · answered May 12 '16 at 08:07

> Ideally a version control system would be able to log each successive commit and show comparisons to older versions

Every VCS has this as basic functionality.

> A bonus would be being able to track the 'Best' version in the branch

Bookmark/tag/label (different VCSs use different names for the same task: marking a specific point in history).

> Secondly the system needs to allow creation of a new file from the template of a given version of an existing file

Branching, which is again at the core of most (all?) VCSs.
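As a rough illustration of the tagging and branching points above, here is a minimal sketch using Git (one possible VCS among several, not necessarily the one recommended here). The file name `model.cfg`, the tag `best-calibration`, the `scenario-*` branch names, and the roughness values are all hypothetical; the script builds a throwaway repository in a temporary directory.

```shell
#!/bin/sh
# Sketch: tag the best calibration, then branch each scenario from that tag.
# All file, tag, and branch names here are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
# Identity set locally so the sketch runs without any global Git config.
git config user.name "Example User"
git config user.email "user@example.com"

# Three calibration attempts; suppose the second one turned out best.
for n in 0.030 0.025 0.035; do
    echo "roughness = $n" > model.cfg
    git add model.cfg
    git commit -q -m "Calibration with roughness $n"
done

# Mark the best calibration (the second commit, one before HEAD) with a tag.
git tag best-calibration HEAD~1

# Start each scenario from the tagged version and commit its own change.
for s in A B C; do
    git checkout -q -b "scenario-$s" best-calibration
    echo "scenario = $s" >> model.cfg
    git commit -q -a -m "Set up scenario $s"
done

# Compare two scenarios, then print the commit they have in common.
git diff scenario-A scenario-C -- model.cfg
git merge-base scenario-A scenario-C
```

The tag keeps pointing at the best calibration even though a later (worse) attempt has a higher commit position, and `git merge-base` reports the shared starting point of any two scenarios.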

> last feature that may break the possible uses of many systems is to be able to have several versions of a file checked out in a folder at any one time.

How do you imagine this? On any OS and/or filesystem, the path Folder1/FileA can hold only one file, so the versions must live at Folder2/FileA ... FolderN/FileA or at Folder1/FileA1 ... Folder1/FileAN.

Anyway, having several working directories with different versions checked out into them isn't a big problem for any VCS either (only the implementation details differ, because the systems differ in nature).
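For what it's worth, one way this might look in Git is `git worktree`, which attaches extra working directories to a single repository. The sketch below is untested against any real CFD setup and assumes Git 2.17 or newer (for `git worktree remove`); the names `model.cfg`, `try-*`, and `run-*` are hypothetical. It creates one sibling directory per parameter variant, then "collapses" to the variant assumed to be best.

```shell
#!/bin/sh
# Sketch: several versions checked out side by side, then collapsed to one.
# All file, branch, and directory names here are hypothetical.
set -e

base=$(mktemp -d)
cd "$base"
git init -q repo
cd repo
git config user.name "Example User"
git config user.email "user@example.com"

echo "roughness = 0.030" > model.cfg
git add model.cfg
git commit -q -m "Baseline calibration"

# One branch and one working directory per parameter variant.
for n in 0.020 0.025 0.035; do
    git worktree add -b "try-$n" "../run-$n"
    echo "roughness = $n" > "../run-$n/model.cfg"
    git -C "../run-$n" commit -q -a -m "Try roughness $n"
done

# All variants now exist at once and could be submitted as a batch.
git worktree list

# Suppose 0.025 was best: fast-forward the main line to it, drop the others.
git merge -q --ff-only try-0.025
for n in 0.020 0.035; do
    git worktree remove "../run-$n"
    git branch -q -D "try-$n"
done
```

Each `run-*` directory holds the same file name at a different version, so a batch script can loop over the directories rather than over renamed files.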

So, given the notes above, you can use an existing VCS, and almost any one will do (the nice-GUI requirement may rule out a few in some cases, e.g. Git on Windows).

The issue with branching is that I would ideally like to have multiple branches checked out in a single folder at any time. The simulations are often executed in batch, or in parallel on a Linux cluster. It is obviously quite possible to have multiple branches/clones in different folders, but then submitting jobs becomes a lot more complicated and a whole new folder structure is required. — , May 12 '16 at 22:07
@TDevlin - [Git-worktree](https://git-scm.com/docs/git-worktree)? See also http://stackoverflow.com/a/30185564/960558 (I don't use Git and haven't tested worktree, I have only read about it), or the [Share Extension](https://www.mercurial-scm.org/wiki/ShareExtension) in Mercurial, if you will **not change** the checked-out data — Lazy Badger, May 13 '16 at 04:29
@TDevlin - but my choice (because I **hate** Git and prefer to use pure vanilla solutions when possible) would be: 1) Mercurial; 2) prepare each set with `hg archive -r REV -t files destname-%R` (%R to distinguish which directory belongs to which revision in the repo); 3) run the tests — Lazy Badger, May 13 '16 at 04:41
