Using git and a workflow where I have many loose changes that are not intended for check-in. Is there a good git way to manage those not-for-check-in modified files?

In my project, we have about 700,000 source files. I'd call it a larger project.

When I am working on fixing a bug or implementing a feature, I quite frequently end up with many files in which I have made ancillary edits: debugging instrumentation, an alternative implementation, an expensive check for a never-happen situation that once appears to have happened in the wild and that I want to catch if it ever happens on my machine, or a clang-format pass because the original had goofy formatting.

To commit my good changes, I branch, carefully add the relevant files, and commit those. (Followed by a push of my changes. Make a PR. Get code review approval. Jenkins builds on all the dozen different target platforms, and runs the test suite. Then I merge my branch into main.)

Probably a fairly typical workflow... except that I have many (1000+) not-for-check-in files that I want to keep modified in my worktree but never merge into main. That latter part is probably atypical.

With Perforce, I would add my not-for-check-in files into a not-for-check-in changelist and park them there. They'd be out of the way, and I could not accidentally pull one of those "tainted" files without taking steps to move it out of the not-for-check-in changelist.

So far, my git tactic of being super-duper careful has worked, but seems fraught with peril. I maintain a stash.txt file that has a list of my not-for-check-in files, and frequently stash them to temporarily get them out of the way, do my git things (making branches, fetch, merge, push, whatever), and stash pop them back in my worktree. Seems janky, manual, and error prone; high cognitive load. Has to be a better way.
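As a rough sketch, the stash round-trip described above could be scripted along these lines. It assumes Git 2.26 or later for `--pathspec-from-file`; the throwaway repository, the file names, and the location of `stash.txt` outside the repository are all invented for the demo and are not details from the actual project.

```shell
#!/bin/sh
set -eu

# Throwaway demo repository (hypothetical paths and file names).
base=$(mktemp -d)
git init -q "$base/project"
cd "$base/project"
git config user.name Demo
git config user.email demo@example.com
printf 'base\n' > feature.c
printf 'base\n' > debug_a.c
printf 'base\n' > debug_b.c
git add .
git commit -q -m 'Initial commit'

# One good change plus two not-for-check-in edits.
printf 'good change\n' >> feature.c
printf 'instrumentation\n' >> debug_a.c
printf 'expensive check\n' >> debug_b.c

# The list of not-for-check-in paths, one per line.
printf 'debug_a.c\ndebug_b.c\n' > "$base/stash.txt"

# Park only the listed files; feature.c stays modified in the worktree.
git stash push -q -m 'not-for-check-in' --pathspec-from-file="$base/stash.txt"
parked_state=$(git status --short)
echo "$parked_state"

# ... branch, fetch, merge, push, and so on ...

# Bring the parked edits back into the worktree.
git stash pop -q
restored_files=$(git diff --name-only | sort | tr '\n' ' ')
echo "$restored_files"
```

This only automates the manual steps; the list file still has to be kept up to date by hand.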

(I have not run into the scenario when I have a single file that has both good changes and not-for-check-in changes. If/when I do, I am aware of how to add-and-commit hunks of changes.)

I have tried the tactic of making a branch, adding and committing both my good changes and my not-for-check-in changes, and then cherry-picking the good changes that should go into main. That scales poorly with the 1000s of not-for-check-in files that need to be sifted through.

Any advice or guidance is appreciated.

Eljay
  • "With Perforce, I would add my not-for-check-in files into a not-for-check-in changelist and park them there." And so too with Git, yes? – matt Nov 11 '21 at 12:59
  • @matt • Does Git support multiple concurrent `index` (or staging), which would be the analog to Perforce `changelist`? – Eljay Nov 11 '21 at 13:27

2 Answers

Frame challenge: this is not a git problem, but an application architecture problem.

Leaving a large number of changes uncommitted in your working tree means:

  • they are probably not backed up anywhere; if your development machine suffers a fault tomorrow, you will lose all this work you've been carefully preserving
  • nobody but you is getting the benefit from them; quite possibly your colleagues are wasting time making the same changes over again
  • the version of the application you're working on is gradually diverging from the one actually deployed to production

Let's look at the example changes you mention:

  • debugging instrumentation - commit it behind a "development-mode only" flag, a set of debugging feature flags, or improve your debug tooling so that it's not necessary (e.g. you rarely need "dump and die" statements when you have a working interactive debugger)
  • alternative implementation - should be on its own branch until it is ready; or committed behind a "prototype" feature flag so that you can turn it off to reproduce live behaviour
  • an expensive check for a never-happen situation that once appears to have happened in the wild and I want to catch it if it ever happens on my machine - again, an ideal use for a feature flag
  • clang-format because the original had goofy formatting - just commit it, and everyone will thank you in the long term
IMSoP
  • Good points. In my case: they are backed up (2 different ways); it is unlikely anyone else would benefit or make similar changes in the code; true, but I do mitigate that. I do push good debugging instrumentation but hold back the debugging instrumentation that is unsuitable for sharing; if the seeds show promise they go in their own branch and follow that path; dev-only personal code is not allowed in the code base (even behind a feature flag); someone else has a re-format-all-the-code project, so this is interim to avoid the "last one to touch it owns it". – Eljay Nov 11 '21 at 14:27
  • I've had the same problem as the OP (*usually* on a smaller scale) and while I agree with your suggestions, the big issue in my case was an inability to convince others that things like reformatting and the debug instrumentation were good ideas. The formatting in particular is one reason I really like project-enforced formatting (clang-format, gofmt, etc). – torek Nov 11 '21 at 15:28

Using `git worktree`, I would work with two separate working trees (from the same cloned repository; no need to clone twice):

  • one for the work in progress, with many files not to be added
  • one for reporting the work which needs to be added: no stash to maintain in this one.
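A minimal sketch of that two-worktree setup, assuming a throwaway repository; the directory names, the `fix-bug` branch, and the file contents are hypothetical placeholders chosen for the demo.

```shell
#!/bin/sh
set -eu

# Throwaway demo repository (hypothetical paths and file names).
base=$(mktemp -d)
git init -q "$base/project"
cd "$base/project"
git symbolic-ref HEAD refs/heads/main
git config user.name Demo
git config user.email demo@example.com
printf 'int main(void) { return 0; }\n' > app.c
git add app.c
git commit -q -m 'Initial commit'

# Main worktree: the messy one, carrying a not-for-check-in edit.
printf '/* debug instrumentation */\n' >> app.c

# Second worktree on its own branch, sharing the same repository data.
git worktree add -b fix-bug "$base/project-clean" main

# The second worktree starts from main and sees none of the local edits.
git -C "$base/project-clean" status --short
git worktree list
```

Both directories share one object store, so a commit made in `project-clean` is immediately visible from `project` without any fetch.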

> Does Git support multiple concurrent `index` (or staging), which would be the analog to Perforce `changelist`?

Not really: it would be easier to make multiple commits:

  • one for your PR
  • one for the rest

And push only the first commit (for PR).
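One way this could look in practice is sketched below. The bare repository standing in for the shared remote, the `my-fix` branch, the file names, and the commit messages are all assumptions made for the demo. The refspec `HEAD~1:refs/heads/my-fix` pushes the parent of the current commit, so the local-only commit on top never leaves the machine.

```shell
#!/bin/sh
set -eu

# Throwaway demo: a bare repository stands in for the shared remote.
base=$(mktemp -d)
git init -q --bare "$base/remote.git"
git init -q "$base/work"
cd "$base/work"
git symbolic-ref HEAD refs/heads/main
git config user.name Demo
git config user.email demo@example.com
git remote add origin "$base/remote.git"

printf 'base\n' > fix.c
printf 'base\n' > scratch.c
git add fix.c scratch.c
git commit -q -m 'Initial commit'
git push -q origin main

# Work happens on a branch, with good and not-for-check-in edits mixed.
git checkout -q -b my-fix
printf 'good change\n' >> fix.c
printf 'debug instrumentation\n' >> scratch.c

# Commit 1: only the change intended for the PR.
git add fix.c
git commit -q -m 'Fix the bug'

# Commit 2: everything else, kept local.
git commit -q -a -m 'LOCAL ONLY: not for check-in'

# Push only the first commit (the parent of HEAD) as the PR branch.
git push -q origin HEAD~1:refs/heads/my-fix

# The remote branch now points at commit 1, not at the local-only commit.
git --git-dir="$base/remote.git" log --oneline refs/heads/my-fix
```

The ordering matters: anything committed on top of the local-only commit has that commit as an ancestor, so pushing it would carry the not-for-check-in changes along.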

VonC
  • I'm familiar with `git worktree`. It does not solve this workflow problem, because the files not to be added and the files to be added are already in the same worktree, and need to be in the same worktree. The not to be added files are a byproduct of the work of the to be added files. – Eljay Nov 11 '21 at 13:36
  • @Eljay Can that byproduct be added in a .gitignore? – VonC Nov 11 '21 at 13:37
  • @Eljay I have edited the answer to address the "changelist" comment. – VonC Nov 11 '21 at 13:39
  • Not sure if `.gitignore` is the right tool for the job. Maybe. These are *tracked* files; I'm not sure of the ramifications for things like `git pull` for changes in the worktree for `.gitignore`'d *tracked* files. – Eljay Nov 11 '21 at 13:40
  • @Eljay Agreed. I thought "byproduct" meant "generated from". But if they are tracked, there is no `.gitignore` without a `git rm --cached` first, which is not what you want. – VonC Nov 11 '21 at 13:41
  • I've tried the multiple commits, which did not work for me (user error, no doubt), because subsequent PRs for the relevant changes also applied the not-for-check-in changes from the previous commit. Is there some sort of "ignore this commit hash" from being propagated in a subsequent push? – Eljay Nov 11 '21 at 13:42
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/239121/discussion-between-vonc-and-eljay). – VonC Nov 11 '21 at 13:44
  • I'm confident that this answer with the chat discussion answers my question. Thank you VonC! – Eljay Nov 11 '21 at 15:33