
I've set up an experimental branch in git, but I'm trying out a few different implementations of a feature, using different libraries. I'd like to keep a record of these, as my own documentation of that approach/library. My pre-git instinct is to comment out the previous experiment --- but what's the best way to handle multiple experiments in git (or version control generally)?

EDIT: I should clarify that these will often be little bits of code (really, parts of features), perhaps only 3-10 lines long. For example, using `getenv("HOME")` vs. `wordexp` on `~`; using `strcpy` vs. `memcpy`. So there could be a lot of them, with alternatives at each step of even a relatively simple feature.

As a first stab, I've made a different branch for each version - but that will get unmanageable fast.

My current guess is:

  1. implement the first experiment
  2. commit it
  3. delete the first experiment
  4. implement the second experiment
  5. commit it

Then I can just look at the log to find a particular experiment. EDIT: each log entry will then carry a commit message and a date.

(Actually I'd probably like to instead comment out the first experiment while working on the second one, so I can refer to it easily - then delete it just before committing the second one).
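
For illustration, the commit-per-experiment steps above might look roughly like this in a throwaway repository. This is only a sketch: the file name `home.c`, its contents, and the `expt:` message prefix are assumptions made up for the example, not part of the question.

```shell
#!/bin/sh
# Sketch: one commit per experiment on a single branch, in a scratch repo.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical file holding the code shared by every experiment.
printf 'int main(void) { return 0; }\n' > home.c
git add home.c
git commit -q -m "base: skeleton without home-directory lookup"

# Experiment 1: implement it, then commit it.
printf 'const char *home = getenv("HOME");\n' >> home.c
git commit -q -am "expt: look up home directory with getenv"

# Experiment 2: delete experiment 1, implement the alternative, commit it.
printf 'int main(void) { return 0; }\n' > home.c
printf 'wordexp("~", &result, 0);\n' >> home.c
git commit -q -am "expt: expand ~ with wordexp"

# Browse the experiments; each entry has a message and a date.
git log --grep='^expt:' --date=short --format='%h %ad %s'

# Print the file as it was in the first experiment (one commit back).
git show HEAD~1:home.c
```

The `--grep` filter only helps if the experiment commits share a recognisable message prefix, which is a convention you would have to keep up yourself.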

Is that a good workflow? Is there a better/standard approach? Many thanks for sharing your experience!

13ren

2 Answers


Using branches is the right way. It shouldn't become unmanageable; that's exactly where git shines. Branch off a commit, with a name that explains what's going on (e.g. `git checkout -b featureX_fooLib`), and work on that. For the next test, branch off the same commit you originally branched from (e.g. `git checkout -b featureX_barLib a41f38`) and work there. You could also tag where you are before starting any of these branches (`git tag featureX_libTestsBase`), then use that tag as the starting point for each new branch (`git checkout -b featureX_bazLib featureX_libTestsBase`).
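
A minimal sketch of that tag-then-branch sequence, run in a scratch repository. The branch and tag names come from the paragraph above; the file `feature.txt` and its contents are placeholders invented for the example.

```shell
#!/bin/sh
# Sketch: several experiment branches that all start from one tagged commit.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical starting point for the library tests.
echo "no library yet" > feature.txt
git add feature.txt
git commit -q -m "featureX: common base for library tests"
git tag featureX_libTestsBase

# First experiment, branched from the tag.
git checkout -q -b featureX_fooLib featureX_libTestsBase
echo "fooLib version" > feature.txt
git commit -q -am "featureX: try fooLib"

# Second experiment, branched from the same tag, not from fooLib.
git checkout -q -b featureX_barLib featureX_libTestsBase
echo "barLib version" > feature.txt
git commit -q -am "featureX: try barLib"

# Both experiments fork from the tagged commit.
git log --oneline --graph --decorate featureX_fooLib featureX_barLib
```

Because both branches name the tag as their start point, neither experiment carries the other's changes.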

I prefer not to have cruft from other places in my code, so I wouldn't want to commit commented-out versions of old code in each new branch. That doesn't mean I wouldn't keep them around while working; I just wouldn't have them be part of the barLib commits. One thing you could do: from the working, final commit of the fooLib branch, save a duplicate of the file, and don't add it to git. When you check out the new barLib branch, it'll still be there as an untracked file. You could also, from the new barLib branch, simply overwrite that branch's version of the file with `git show featureX_fooLib:filename >filename`. Now you can go in, comment out that bit, start working, and patch-add (`git add -p`) only the new stuff for each new commit. This might be the best of both worlds.
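
Here is a rough sketch of the `git show` trick, in a variant that writes the old experiment to a separate untracked reference file instead of overwriting the working file. The names `feature.c` and `feature.c.fooLib` are assumptions for the example, and the interactive `git add -p` step is left as a comment because it needs a terminal.

```shell
#!/bin/sh
# Sketch: keep the previous experiment at hand without committing it.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical base file and first experiment.
echo "base implementation" > feature.c
git add feature.c
git commit -q -m "featureX: base"
base=$(git rev-parse HEAD)

git checkout -q -b featureX_fooLib "$base"
echo "fooLib implementation" > feature.c
git commit -q -am "featureX: try fooLib"

# Start the second experiment from the same base.
git checkout -q -b featureX_barLib "$base"

# Dump the fooLib version of the file into an untracked reference copy.
git show featureX_fooLib:feature.c > feature.c.fooLib

# The reference copy shows up as untracked ("??") and stays out of history
# unless it is explicitly added.
git status --short

# While working on barLib, stage only the hunks that belong to it:
#   git add -p feature.c
```

Redirecting to `feature.c` itself, as the answer describes, puts the old code directly in the working file; the separate copy just avoids having to skip it during patch-adding.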

Gary Fixler
  • I'm not sure about `git branch` having pages of output though! Thanks for tips on tag, an untracked copy, and show. BTW: turns out `show` unhelpfully uses `:` instead of `--`: `git show featureX_fooLib:filename >filename` http://stackoverflow.com/questions/610208/how-to-retrieve-a-single-file-from-specific-revision-in-git – 13ren May 17 '13 at 21:35
  • What do you mean by '`git branch` having pages of output?' How many branches are you making? – Gary Fixler May 18 '13 at 12:47
  • For this one feature (not finished yet) I have seven alternatives so far. (It's not 7 alternatives of one thing, but 2-3 alternatives for each sub-part of the feature). That's just one feature - and an extremely simple one at that! A linear extrapolation predicts pages of alternatives really fast. Though, realistically, there's reuse of techniques, so the explosion is mainly when entering new areas - but that happens a lot if you're doing new stuff. [To answer your question directly: 3. I've only made three experimental branches, because I switched to individual commits for each alternative.] – 13ren May 18 '13 at 14:24
  • On one branch, they're all going to show up in a long line that's hard to work with. If you separate them out, you could not only choose to see concise history for only one (`git log --oneline alt7`), but also for several, interlinked (`git log --oneline --graph --decorate alt1 alt2 alt3`), or alias viewing several together (`git config alias.bestAlts 'log --oneline --graph --decorate alt1 alt4 alt8 alt23'`). You could trash any that didn't pan out (`git branch -D alt2 alt9`), or get a list of current alternatives (`git branch`), and even work with that list in an editor (`git branch | vim -`). – Gary Fixler May 18 '13 at 20:23
  • Thanks! `git log` as an expt browser... `--graph` is useful to see their relationships. It might look nicer if each branch grouped together alternatives for a specific part (leaves of tree) - very complex with 50 branches, one commit each! (I was thinking there's no commit message to document the alternative - but each branch also has a commit!) How would `git branch` give a list of "*current* alternatives"? I like @twalberg's idea of namespacing, but wildcards don't seem to be recognized for branches - `git log experiments/*` doesn't work. But you can `git log --grep=expt --oneline --all`. – 13ren May 18 '13 at 22:33
  • I'm not sure what you mean by "current alternatives." If you want the names of your branches, use `git branch`. If you want to know which ones you've most recently worked on, `git log` (and all the options we've been using) sorts commits in order, newest at the top. Check out a branch and commit something, and it moves to the top. `git log --all --graph --decorate --oneline` will make it easy to see all the branches, sorted, newest at the top. If it goes off the screen you'll be in the pager, so `d` and `u` will move you down and up half-pages. – Gary Fixler May 19 '13 at 00:42
  • I don't know if there's a git command to show you just the time-sorted names of all branches, but you can roll your own, e.g.: ``git branch | sed 's/^[* ]*//' | while read b; do echo `git log -1 --format=%ct $b` $b; done | sort -nr | cut -d' ' -f2``. Newest will be at the top. This also gives you a hook for finding particular ones. Alias that, e.g. to `brio` ('branches in order'), then you can `brio | grep ^alt` to see just the ones that start with `alt`. – Gary Fixler May 19 '13 at 00:45
  • Here's the bash alias: ``alias brio='git branch | sed "s/^[* ]*//" | while read b; do echo `git log -1 --format=%ct $b` $b; done | sort -nr | cut -d" " -f2'`` – Gary Fixler May 19 '13 at 00:49
  • For "current alternatives", I was just quoting you: *or get a list of current alternatives (`git branch`)*. I wasn't sure what you meant either... that's what I was asking with *How would `git branch` give a list of "current alternatives"?* :) – 13ren May 19 '13 at 13:13
  • Thanks for working out the alias. I'm leaning towards your first suggestion, of `--graph` - as you say, it's sorted by time already. One concern is that it will look too complicated, with many branch lines on the left margin; but if most of the branches are leaf nodes (i.e. with just one commit), it might look more like one main trunk, with flowers dotted along it (as it were). I may have to try it, to see how it goes. Anyway, many thanks again! – 13ren May 19 '13 at 13:21
  • I will just add that I felt the `--graph` output looked big and scary when I started using git 7+ months ago, but forcing myself to use it, to follow the lines and connect the dots, it's all become second-nature. Branches with hundreds of commits, forks and merges all over the place, and it's all quite sensible and easy to read. – Gary Fixler May 19 '13 at 13:34
  • I see re: the 'current alternatives' now. I just meant that each library provided an alternative approach to the solution, so each of your branches would be one of your alternative solutions with a different library than the others. The list of 'current alternatives' would just be the list of branches you had created thus far. – Gary Fixler May 19 '13 at 13:38
  • RE: `--graph` I'm sure I'd improve with time; but I prefer to minimise complexity when I can! RE: 'current alternatives' - the problem is separating experiments for different features, i.e. when I have a bunch of branches for one feature, and then another bunch for the next feature. I could distinguish them by time, or even `--grep` them by name, but it would be nice to use a standard feature to group them... that was a benefit of grouping by branch. IDEA: there'll be a common ancestor for all alternatives of a specific feature... and I think `log` can show all commits with a common ancestor. – 13ren May 19 '13 at 14:08
  • Just thinking, another way to mark a new feature is with a `tag`. Might help to identify those common ancestors that begin a new feature. – 13ren May 19 '13 at 18:00
  • "the problem is separating experiments for different features, i.e. when I have a bunch of branches for one feature, and then another bunch for the next feature." - that's when you branch branches! You're describing a hierarchy, which git perfectly models. Commits on a branch track progress of a unified objective. Branches let you fork reality to try out alternate concepts. What you're trying to do instead is keep one linear branch on which you constantly overwrite reality. What if you want to revisit a few ideas and play with them some more, adding new commits? Branches make this easy. – Gary Fixler May 20 '13 at 01:36
  • Yes, that's what I meant by "IDEA: there'll be a common ancestor..." - great minds think alike. :) It's the right approach in theory, but we'll see about the practice... – 13ren May 20 '13 at 16:37
  • You know how I was worried about too many branches? For "current branches", there's `git branch --contains <commit>` - which lists only those branches containing that commit. So, `<commit>` can be the ancestor branch representing the "current feature" that groups all its alternatives. Note that this uses a commit, not a branch... MAYBE: create many branches, but *think* of them as commits. The tree of commits is what is really wanted. – 13ren May 20 '13 at 18:42
  • Actually `git branch --contains <commit>` also works when `<commit>` is a branch. So, with 26 branches *featureA, ..., featureZ*, then `git branch --contains featureA` lists only *featureA featureA_expt1 featureA_expt2 featureA_expt3* etc. You can create another expt branch with: `git checkout -b featureA_expt4 featureA`. I think this works from wherever you are - e.g. from the branch where you just added a different expt - but not sure. (However, sometimes it might make sense to checkout the base featureA first, anyway, if that's the starting point. But I often reuse bits, so maybe not.) – 13ren May 20 '13 at 18:56
  • `git branch --contains HEAD^` may be a bit more intuitive, to list all the alternatives to the one you're working on. – 13ren May 20 '13 at 19:03
  • Branches, AKA heads, are just pointers to the latest commit on a branch. When you give git a branch name, under the hood it resolves to the commit where that branch head points. Most commands that take commits also take branches, tags (which are also just pointers to commits), and all of the above with relative offsets. Under the hood you're always working on and with commits. – Gary Fixler May 20 '13 at 19:15
  • "You can create another expt branch with: `git checkout -b featureA_expt4 featureA`." - Right. Most of the first paragraph in my answer gives examples of this. I didn't specifically point it to a branch, though I used a tag in one and a commit hash in another, but all of these - tag, commit, branch - point to a commit under the hood. I didn't give an example like yours - branching off a branch name - as I presumed you'd have a couple of commits on the branch at that point, and I thought you'd want to branch all the branches from a common base, not from the head of each experimental branch. – Gary Fixler May 20 '13 at 19:20
  • Thanks! Yes, reviewing your answer, I see what you mean. Your tags and commits also make sense. I've rediscovered bits of your answer for myself in these comments (e.g. tags is 8 above). My first concern, of "pages of branches", is addressed with `git branch --contains <commit>`. A similar one for log is: `git log --graph --branches=featureA* featureA^..`. The `--branches` includes those leaves; `featureA^..` limits how far back it goes. Do you think that's the best way to do it? git log seems to have many other options... – 13ren May 20 '13 at 20:11
  • Also in your answer you say (after getting a copy of the previous experiment) *comment out that bit, start working, and simply patch-add the new stuff for each new commit.*. I guess that by *patch-add* (`git add -p`), you mean to skip the comments? i.e. a way to both have the comments as a reference, but not commit them. Is that right? It's funny, because that's how I did it the first time... but (maybe because of my inexperience) it was really *really* confusing to track which version of the feature I was working on. – 13ren May 20 '13 at 20:20
  • Yeah, that's what I meant. If I really needed it to stick around for awhile, I might commit it in, but if it's just helping me get the next test written, I'd really not like to have it in my history. Commented-out code is usually a bad thing. No one - often including the author - ever knows if it's okay to delete it. It becomes fossilized in the code forever. I suppose you could version it and delete it when you're done. Then it's in there for future reference of how you wrote the code, but at least it's not sticking around in current versions. – Gary Fixler May 20 '13 at 20:55
  • (Did you see my second last comment, about `log`? http://stackoverflow.com/questions/16615680/experimental-branch-how-to-record-multiple-alternative-implementations#comment23962575_16616423) – 13ren May 21 '13 at 14:18
  • I did. I think you have the tools you need now. You just need to start working with it and getting comfortable with it. I'm 7 months in, and still learning new flags and features, and simpler workflows. It makes more sense all the time. Log is a non-destructive thing. It just shows you commits. You can experiment with its flags to see what they do, and find what works best for you. I've never noticed the `--contains` flag, so I'll try that out and see how I like it, and keep it if it works for me. – Gary Fixler May 21 '13 at 19:24
  • Great! Actually, I started trying out your suggestion a couple of days ago - my last few comments are my experiences with it. Having hundreds of branches is a creative solution, but seems unconventional for git, because not supported directly by the tools - perhaps it should be! e.g. if `git log` had `--contains` instead of having to get the branches by globbing (or your `git branch | sed 's/^[* ]*//'` nice solution BTW; for detached head, need an extra filter e.g. `git branch | sed 's/*\|(no branch)//g'` - that rigmarole would be unnecessary with proper `git log` support.) – 13ren May 21 '13 at 20:05
  • BTW: The confusion I had with add-patching was because I started with a mass of three different experiments overlaid and commented out, which I then tried to separate into different commits with `git add -p`... that was a *really* bad idea/workflow! – 13ren May 21 '13 at 20:45
  • Also, I should add I've been using git on and off for [3 years](http://stackoverflow.com/questions/2119480/changing-the-message-of-the-first-commit-git) - it's not like I know *nothing* about it; it's just that I only look into new ways of doing things as I need them. (I find that study without a need doesn't stick in my head very well). I'm a fellow-git(ter), not a newbie - no matter how much I might seem the latter! :D – 13ren May 21 '13 at 20:46
  • Actually, I do much more complicated patch-adding through the fugitive plugin in Vim. I do things like patch add 2 or 3 argument changes in a function signature, or patch add chunks, then edit the chunks in the index when I notice problems, commit, then pull the changes from the index back into the working tree, but even sometimes then only the parts I care about. – Gary Fixler May 21 '13 at 21:13
  • That doesn't sound more complicated than untangling 3 layers of changes, some parts contiguous, some not - but I haven't seen the details of what you're doing, so maybe it is. Good to know about fugitive, I use vim too. – 13ren May 21 '13 at 21:25
  • If you're saying you were making changes inside the commented code, I agree that could get difficult to tease out. What I'd originally thought you were doing was copying, say, a function from a current test back to your original file (the base version, pre-testing anything), duplicating it, commenting out the top one, then working on the one below it. This would be easy to patch-add, even if you had several in a row - just uncomment all, save, patch add in the first one and commit, patch add the next and commit, etc. – Gary Fixler May 21 '13 at 21:43
  • That's the workflow I suggested, but not how I actually started. What made it difficult was that each experiment wasn't in a contiguous block, but mixed together - some had code before and after a base section, while others had different bits before and after that base section (had to sort out `ABcA'B'` manually); some added bits were shared by some experiments but not others (`abcd`, where `A=ab`, `B=bc`). I also had trouble remembering which commit I was doing when distracted by other issues - that's partly inexperience. Not impossible - I did it - but tedious, confusing, not-recommended. – 13ren May 21 '13 at 22:14
  • I'm designing a good JSON hierarchy today, and the code that uses it. It made me think of this thread. I was going to commit the ideas on multiple branches, but decided to go with commits on the same branch, as each idea is going to be a full replacement of the previous one, as I work out the idea to a deeper level each time. They *are* alternate takes on the idea, though - reconfigurations of the hierarchy with different needs, etc. One thing in favor of single-branch commits: fugitive's `:Glog` command lets you step back and forth through your buffer's commit history in a separate buffer. – Gary Fixler May 23 '13 at 01:38
  • It sounds like a mixture of alternatives/improvements, with "a deeper level each time" favouring one branch. I'll be interested to hear how you go in the last step, of selecting out the idea you like best. I had both single- and multi- approaches. I simplified by checking out the expt I liked, made a new branch, then squash/fixup all the steps. Being able to quickly switch between versions (`:Glog`) sounds great! I've been checkingout+`:e`; diffing; and `git show` - it's a bit awkward. I guess fugitive could equally easily switch between branches - if it knew which one is "next" and "prev". – 13ren May 23 '13 at 03:52
  • Well, if I were trying to select from the alternatives, I'd go with branches, but I'm going from "spill out my thoughts" to "refined idea." There's a chance I could end up back at one of them by the end, but I find this kind of thing self-sorts such that I create the worst option first, and the best one last, as I really come to understand the problem. It's not guaranteed, but it's likely. Also, I can lie by interactively rebasing it into whatever order I want at the end ;) – Gary Fixler May 23 '13 at 04:04
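
To pull together the listing commands discussed in the comments above, here is a small sketch in a scratch repository. `git for-each-ref --sort=-committerdate` is offered as an alternative to the hand-rolled `brio` pipeline (it is a swapped-in technique, not something proposed in the thread), and the empty commits are stand-ins for real changes. The `featureA`/`featureB` branch names follow the comments' examples.

```shell
#!/bin/sh
# Sketch: list branches by recency, and group experiments by common ancestor.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Empty commits stand in for real work.
git commit -q --allow-empty -m "root"
root=$(git rev-parse HEAD)

# An unrelated feature, so the filters below have something to exclude.
git checkout -q -b featureB "$root"
git commit -q --allow-empty -m "featureB: base"

# featureA and two alternatives branched from it.
git checkout -q -b featureA "$root"
git commit -q --allow-empty -m "featureA: base"
git checkout -q -b featureA_expt1 featureA
git commit -q --allow-empty -m "featureA: alternative 1"
git checkout -q -b featureA_expt2 featureA
git commit -q --allow-empty -m "featureA: alternative 2"

# Branch names ordered by the date of their latest commit, newest first.
git for-each-ref --sort=-committerdate \
    --format='%(committerdate:short) %(refname:short)' refs/heads/

# Only the branches that contain the featureA base commit.
git branch --contains featureA

# Graph limited to featureA's alternatives; featureA^.. cuts off older history.
git log --oneline --graph --decorate --branches='featureA*' featureA^..
```

Quoting the `featureA*` pattern keeps the shell from expanding it against file names before git sees it.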

Keep one branch per experiment and give each a suitable name, maybe even "namespacing" them (e.g. `git checkout -b experiments/testing-algorithm-A`). Initially, these branches will live only in your local repository, but you could push them to origin or to a different repository so you have redundant copies of the experiments, just in case.
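
As a sketch of how namespaced branches can be listed and browsed as a group, assuming a scratch repository and a second hypothetical branch `experiments/testing-algorithm-B`. The empty commits are placeholders for real changes.

```shell
#!/bin/sh
# Sketch: namespaced experiment branches, listed and logged by pattern.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Empty commits stand in for real work.
git commit -q --allow-empty -m "base"
base=$(git rev-parse HEAD)

git checkout -q -b experiments/testing-algorithm-A "$base"
git commit -q --allow-empty -m "expt: algorithm A"

git checkout -q -b experiments/testing-algorithm-B "$base"
git commit -q --allow-empty -m "expt: algorithm B"

# List only the branches in the experiments/ namespace.
git branch --list 'experiments/*'

# Log only those branches; the pattern is quoted so git, not the shell,
# expands it.
git log --oneline --graph --decorate --branches='experiments/*'
```

A bare `git log experiments/*` fails because the shell tries to expand the glob against file names; `--branches=<pattern>` hands the pattern to git instead.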

twalberg
  • Namespacing would help management, but there'd still be pages! I've tried commits just now, and I like being able to include a message, and having it dated. I agree branches make more logical sense though. – 13ren May 17 '13 at 21:41