38

I am using git to manage a C++ project. While working on it, I find it hard to organize changes into commits when a single change touches many places.

For example, I may change a class interface in a .h file, which will affect the corresponding .cpp file, and also other files using it. I am not sure whether it is reasonable to put all the stuff into one big commit.

Intuitively, I think commits should be modular, each corresponding to one functional update or change, so that collaborators can pick changes accordingly. But it seems that sometimes it is inevitable to include lots of files and changes to make a functional change actually work.

Searching did not turn up any good suggestions or tips, so I wonder if anyone could share some best practices for making commits.

PS. I've been using git for a while and I know how to interactively add/rebase/split/amend/... What I am asking is the PHILOSOPHY part.

Update: Thanks for all the advice. Maybe this has to be learned through practice. I will keep the question open for some time to see if there are more suggestions.

Elnur Abdurrakhimov
Ivan Xiao
  • "I think the commits should be modular" -> use tags for that. Tags don't have to be for version numbers. Commit as often as needed, while you don't break the build. – glmxndr Jul 01 '11 at 05:29
  • 1
    @subtenante: using tags for that is probably not the best idea. You will completely lose track of the important tags – knittl Jul 01 '11 at 07:42
  • 1
    @knittl: can't see why. Use naming conventions for "important" tags, and use a pattern when listing them. – glmxndr Jul 01 '11 at 08:21
  • 1
    @subtenante: unfortunately git does not scale well with too many tags – knittl Jul 01 '11 at 08:32

7 Answers

22

I tend to commit as you propose: a commit is a logically connected change set. My commits can be anything from a one-liner to a change in all files (for example add/change a copyright notice in the source files). The reason for change need not be a full task that I am implementing, but it is usually a milestone in the task.

If I have modified something that is not related to my current commit, I tend to do an interactive add to separate out the unrelated changes, too - even when it is a whitespace tidy up.

I have found that commits that simply dump the working state into the repository are a lot less useful: I cannot easily backport a bugfix to an earlier version, or include a utility function in another branch, if the commits are all over the place.

One alternative to this approach is using a lot of tiny commits inside a feature branch, and once the whole feature is done, do heavy history rewriting to tidy up the commits into a logical structure. But I find this approach to be a time waster.

vhallac
18

This is exactly the use case for which the index (the staging area) was introduced in git.

Feel free to make as many unrelated changes as you like. Then choose which of them belong together and make several atomic commits in one go.

I do it all the time. If you use git-gui or any of the other GUI clients, you can choose not only the file that you want to commit, but also hunks within the files, so your commits are as atomic as possible.

Tim Kist
lprsd
  • 2
    You can also use `git add -p` to selectively commit hunks from the command line without using a GUI client. – Dave Sherohman Jul 01 '11 at 09:40
  • This is true. I usually use `gitx` to do the interactive staging job. – Ivan Xiao Jul 06 '11 at 15:50
  • 2
    git add -i is even more powerful than -p. Gives you a menu based cli system where you can update whole files, add patches and more. – Zefira Dec 18 '12 at 21:59
  • You can also rebase interactively to add or remove stuff from past commits in a topic branch if those commits were already created (`git rebase --interactive HEAD~x` where x is how many commits back). [Here is why](https://medium.com/@fagnerbrack/one-commit-one-change-3d10b10cebbf) one should always create atomic commits (and what they are exactly). – Fagner Brack May 07 '16 at 13:54
18

I try to follow these practices, in this order:

  1. A commit must not fail a build. Most important!

  2. It should be made of one logical unit of change - whether a single line/character or a whole file/class with corresponding changes in other parts of code, still following #1.

    What is a logical unit of change? In terms of git: if you can describe the changes in the commit message in one short sentence (without any ANDs, of course), and you cannot break that description down into smaller units, then I call that one unit.

  3. Commit message should clearly specify the essence of the commit.

  4. Commit message should be small, typically no greater than 80 chars. Any more elaboration should be part of the description.
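The rules about the message can be checked mechanically. Below is a minimal Python sketch, not part of any git tooling: the function name `check_commit_message` and its `max_subject` parameter are made up for illustration, and the "and" test is only a crude proxy for "more than one logical unit".

```python
from typing import List


def check_commit_message(message: str, max_subject: int = 80) -> List[str]:
    """Return a list of problems found in a commit message (empty if none)."""
    problems = []
    lines = message.strip("\n").split("\n")
    subject = lines[0].strip()

    if not subject:
        problems.append("empty subject line")
    if len(subject) > max_subject:
        problems.append(f"subject is {len(subject)} chars, limit is {max_subject}")
    # Crude proxy for "more than one logical change": the word "and" in the subject.
    if " and " in f" {subject.lower()} ":
        problems.append("subject contains 'and'; consider splitting the commit")
    # Any description must be separated from the subject by a blank line.
    if len(lines) > 1 and lines[1].strip():
        problems.append("missing blank line between subject and description")
    return problems


print(check_commit_message("Fix parser and update docs"))
```

Passing a smaller `max_subject` (for example 50) gives a stricter subject limit.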

Sailesh
  • 6
    I find the 4th point confusing. Commit message is the whole text in the editor, which should obviously be as long as needed to explain what the commit is about. It should be structured as 1 line "subject" of at most *50* (so say the git's own coding guidelines) characters followed by blank line followed by whatever detailed explanation is needed. – Jan Hudec Jul 01 '11 at 07:29
  • 1
    The "subject" as you mentioned is the "message" by their terminology, and the rest comes under "description". – Sailesh Jul 01 '11 at 07:35
  • 1
    Yes, I follow these strategies as well. But there are still circumstances where, even if I follow these practices, a single function change can produce a fat commit – Ivan Xiao Jul 06 '11 at 15:50
  • 2
    I wouldn't necessarily blame Git if you need a fat commit in order to change a single function. It's just as much how you structure your code, separate concerns and encapsulate data. – theodorton May 02 '12 at 00:25
12

Disclaimer: I too am in the process of trying to work out what commits should be, and how the final history should end up looking. However, I wanted to share some of the resources that I've come across during my own research.

First off, the Linux Kernel project has a great page on Merge Strategies for getting your code merged upstream. They talk about making bite-sized commits; doing one or more refactoring commits before the actual additions you want (the refactorings are supposed to make your feature cleaner of course ;) and other things.

My other favorite page is Git Best Practices by Seth Robertson. This is not only a page on a lot of best practices for using git, but it also is a tremendous resource, containing enough information about a broad variety of git topics to make googling for more in-depth information trivial.

Zefira
  • 3
    Upvoted for the Seth Robertson link, thanks. I'm now familiar with the term "Sausage Making" with relation to Git, I can die in peace. –  Feb 13 '15 at 12:36
5

What I am asking is the PHILOSOPHY part.

I think I can answer this because I have been involved in some personal research recently.

One should focus on creating atomic commits, which means taking some extra care over a few things for each commit:

  • It shouldn't have any value if done partly
  • It shouldn't break the build
  • It should contain a good message and body for traceability (with ticket references whenever possible)
  • It shouldn't contain a lot of diff noise (whitespace and style changes, unless the commit is specific for that)
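The "diff noise" point can be illustrated with a small check for whether a change is whitespace-only and so belongs in its own tidy-up commit. This is a rough Python sketch under stated assumptions: the helper name `is_whitespace_only_change` is hypothetical, and the comparison is deliberately crude (it strips all whitespace inside each line, so it would also treat `int x` and `intx` as equal, and it does not detect re-wrapped lines).

```python
def is_whitespace_only_change(old: str, new: str) -> bool:
    """True if old and new differ, but only in whitespace."""

    def normalize(text: str) -> list:
        # Remove all whitespace inside each line and skip blank lines entirely.
        return ["".join(line.split()) for line in text.splitlines() if line.strip()]

    return old != new and normalize(old) == normalize(new)


before = "int add(int a,int b){\n  return a+b;\n}\n"
after = "int add(int a, int b) {\n    return a + b;\n}\n"
print(is_whitespace_only_change(before, after))
```

On the command line, `git diff -w` gives a similar view by ignoring whitespace when comparing lines.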

Commits should be focused on one change, and one change only. Anything more than that can have bad side effects.

Some people might argue that this is too much, that it is not practical. But the best argument in favor of it, even for small companies, is that building atomic commits forces your design to be more decoupled and consistent, because fully atomic commits require a healthy codebase that is not a mess.

If you force good commit practices consistently, you will be able to drive the engineering culture and the code itself to a better state.

Fagner Brack
3

Sometimes, when you do a big refactoring, it's inevitable that you change many files in one commit. When you change the interface of a class, you have to change the header, the implementation and all places that use the interface in one commit, because no intermediate state would work.

However, the recommended practice is to first change the interface without introducing any new functionality, test that you didn't break existing functionality, and commit that. Then implement the actual feature that needed the updated interface and commit that separately. You will probably end up making some adjustments to the refactoring in the process, which you can squash into the first commit using interactive rebase.

That way there is one big commit, but it does not do anything hard; it just shuffles code around, so it should be mostly easy to understand even though it's big. Then there is a second commit (or more, if the feature is big) that is not too big.

Jan Hudec
1

Something that very much helped me in working out what I was committing, and why, was moving our repository organisation over to the 'feature branch' model, as popularised by the Git Flow extension.

By having branches describing each feature (or update, bugfix etc.) that is being worked on, commits become less about the feature and more about how you are going about implementing that feature. For example, I was recently fixing a timezone bug within its own bugfix branch (bugfixes/gh-87 for example), and the commits were split up into what was done on the server side, on the front end, and within the tests. Because all of this was happening on a branch dedicated to that bug (with a GitHub issue number too, for clarity and auto-closing), my commits were seen as the incremental steps in solving that problem, and so required less explanation as to why I was doing them.

Community
beseku