git re-write history by squashing as many commits as possible by the same user

Question

I have a git repo with 50k files. I want to make a new version of this repo that:

1. Has only 5k files
2. Keeps only the commit history of those 5k files
3. For one user (me), squashes their commits to as few as possible

For 1 and 2, I believe there are already answers on S.O., for example this one.

For 3, I can't imagine any trivial way to do it. It seems like I'd want to keep squashing my commits together until a commit by someone else touches a file that I've touched in the commits being squashed. Is there a command combination to do that, or would I have to write a script to figure it out?

In other words, say there are files A, B, and C:

commit 1: I edit A
commit 2: I edit B
commit 3: Other user edits C
commit 4: I edit A
commit 5: I edit C
commit 6: Other user edits C

At this point, commits 1, 2, 4, and 5 can be squashed, so the new history would be:

commit 1: Other user edits C
commit 2: I edit A, B, A, C
commit 3: Other user edits C
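
To make the rule concrete, here is a minimal Python sketch of the grouping logic only. It does not touch git. It assumes a linear history (no merges), treats two commits as independent whenever they touch disjoint sets of file paths, and ignores renames and deletions. The `Commit` type and the `plan_squash` function are hypothetical names made up for this illustration.

```python
from typing import NamedTuple


class Commit(NamedTuple):
    # Hypothetical record for illustration; not a git data structure.
    id: int
    author: str
    files: frozenset


def plan_squash(commits, me):
    """Return the new history as a list of groups of original commit ids.

    Assumes a linear history and file-level independence: another user's
    commit may move ahead of my pending commits only if it touches none
    of the files those pending commits touch.
    """
    plan = []        # groups in their new order
    pending = []     # my commits waiting to be squashed together
    touched = set()  # files touched by the pending commits
    for c in commits:
        if c.author == me:
            pending.append(c.id)
            touched |= c.files
        elif touched & c.files:
            # Conflict: flush my squashed group first, then their commit.
            plan.append(pending)
            plan.append([c.id])
            pending, touched = [], set()
        else:
            # No shared files: their commit can go before my pending group.
            plan.append([c.id])
    if pending:
        plan.append(pending)
    return plan


# The six commits from the example above.
commits = [
    Commit(1, "me", frozenset({"A"})),
    Commit(2, "me", frozenset({"B"})),
    Commit(3, "other", frozenset({"C"})),
    Commit(4, "me", frozenset({"A"})),
    Commit(5, "me", frozenset({"C"})),
    Commit(6, "other", frozenset({"C"})),
]
print(plan_squash(commits, "me"))  # [[3], [1, 2, 4, 5], [6]]
```

The output groups match the three-commit history listed above. Turning such a plan into an actual rewritten history (for example, by generating a rebase todo list) is a separate step that this sketch does not cover.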

This can get complicated pretty quickly. E.g. what if commit 6 was "Other user deletes C", then the optimal post-squash sequence would become simply "commit 1: I edit A, B, A". And we haven't even got to merges yet. :) — jingx, Nov 24 '21 at 22:59
Yea, I know. Even a non-optimal solution (squash consecutive commits by me) would be useful if it's simple to do. I might be able to do that with an interactive rebase by doing some regexs to replace pick with squash if the same user appears on consecutive lines. — gman, Nov 24 '21 at 23:44
My gut feeling would be to go with the concept in your last comment- don't re-order any commits. — TTT, Nov 25 '21 at 04:41
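
The simpler approach from the comments (no reordering, just squash consecutive commits by me) could be sketched as a small filter over the interactive rebase todo list. This is an illustration under assumptions, not a tested workflow: it assumes the todo lines look like `pick <hash> <author> | <subject>`, which is what something like `git -c rebase.instructionFormat='%an | %s' rebase -i <base>` is expected to produce, and the exact layout may differ between Git versions. The function name `squash_consecutive` and the `" | "` separator are hypothetical choices.

```python
def squash_consecutive(todo: str, me: str, sep: str = " | ") -> str:
    """Rewrite 'pick' to 'squash' for each commit by `me` that directly
    follows another commit by `me` in the todo list.

    Assumes each pick line has the form: pick <hash> <author><sep><subject>
    """
    out = []
    prev_mine = False
    for line in todo.splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "pick":
            author = parts[2].split(sep, 1)[0]
            mine = author == me
            if mine and prev_mine:
                line = "squash " + parts[1] + " " + parts[2]
            prev_mine = mine
        elif line.strip() and not line.startswith("#"):
            # Any other todo command (exec, break, ...) ends the run.
            prev_mine = False
        out.append(line)
    return "\n".join(out) + "\n"


# Assumed todo list for the six-commit example in the question.
todo = (
    "pick a1 me | edit A\n"
    "pick b2 me | edit B\n"
    "pick c3 other | edit C\n"
    "pick d4 me | edit A\n"
    "pick e5 me | edit C\n"
    "pick f6 other | edit C\n"
)
print(squash_consecutive(todo, "me"), end="")
```

For the example history this yields four commits instead of three, since commit 3 is not moved. A script built around this function could be pointed at by `GIT_SEQUENCE_EDITOR` so that it edits the todo file in place. Using `fixup` instead of `squash` would avoid the commit-message prompt for each squashed commit, at the cost of discarding those messages.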

0 Answers