
I have a project that started as a single directory. As it grew, I added subdirectories to the main project, one per sub-project. One of these subdirectories, say lib/, contains the code common to all my projects based on the same architecture.

Since lib/ has become a project worth its own Git repository, I'd like to make it independent, but I don't want to lose all the related commits I made while working in the main project. What I'd like is a copy of my main repository, with its entire commit history, stripped of everything but the files in lib/.

So I saw it can be done.

I must confess I do not always understand Git jargon very well, so please bear with me.

I did `git clone -l <main project> lib`, then I ran `git filter-branch -f --prune-empty` from directory lib/. I ran `git status` and it told me that origin and this "branch" differ and that I should run `git pull`... Hmmm... I'm using only local repositories, so I tried `git remote rm origin` and the message went away. However, I suppose there's a shortcut to avoid this, right?

Anyway, the log tree now shows every commit (or whatever these entries are) in triplicate:

$ git log --reflog --graph --oneline --decorate --date-order
* 880d3e8 Framework Library - Update
| * 2cfbb42 (refs/original/refs/heads/1.0) Framework Library - Update
| | * 578968f (HEAD -> 1.0) Framework Library - Update
* | | 65daea4 Tools: ECU simulator (new)
| * | 62981c7 Tools: ECU simulator (new)
| | * 9e4015d Build 423 - Makefile bugfixes and small changes
* | | 3eddb88 Build 423 - Makefile bugfixes and small changes
| * | 82b5ed1 Build 423 - Makefile bugfixes and small changes
* | | bb46ee9 Build 423 - Bugfixes
| * | 7cd40ac Build 423 - Bugfixes
* | | ab0058c Build 420 - Bugfixes
| * | 3f3257b Build 420 - Bugfixes
| | * 2f2184f Build 416 - Enhancements and fixes
* | | 39ea1de Build 416 - Enhancements and fixes
| * | 11c1f0f Build 416 - Enhancements and fixes
| | * 770d628 Build 406 - Enhancements
* | | 952f9a2 Build 406 - Enhancements
| * | f0c86c3 Build 406 - Enhancements
| | * 5b8cfef Build 405 - Bugfixes and enhancements
* | | 6c1b590 Build 405 - Bugfixes and enhancements
| * | 0e79341 Build 405 - Bugfixes and enhancements
...

Is that normal? How can I trim the redundant ones?

I'm working with local repositories only and do not plan to use remote repositories any time soon. Well, unless I'm missing something, of course.

Oh, and I do have a backup. (If it were just one...)

2 Answers


What git filter-branch does is rewrite history, i.e., it creates new commits without the stuff you filtered out.

So the reason you might be seeing three copies is that the old history lines are still there after the filter-branch run. Running the garbage collector with --prune set to now (or all) should fix this.

git gc --prune=now
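
Note that git gc can only discard commits that nothing points at any more. A minimal sketch of an in-place cleanup, along the lines of the checklist in the git filter-branch manual, could look like the following. The function name `drop_filter_branch_leftovers` is hypothetical, and the sketch assumes it is called from inside the filtered repository and that you no longer need the backup refs:

```shell
# Hypothetical helper: remove whatever still references the pre-filter
# commits, then garbage-collect. Call it from inside the filtered repository.
drop_filter_branch_leftovers() {
    # 1. Delete the backup refs that filter-branch stored under refs/original/.
    git for-each-ref --format='%(refname)' refs/original/ |
        while read -r ref; do
            git update-ref -d "$ref"
        done

    # 2. Expire all reflog entries, which also keep the old commits alive.
    git reflog expire --expire=now --all

    # 3. Only now can gc prune the objects that are no longer referenced.
    git gc --quiet --prune=now
}
```

After calling it, `git for-each-ref refs/original/` prints nothing. This is destructive: once the backup refs and reflog entries are gone, the pre-filter state can only be recovered from a backup or from the original repository.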

Yazeed Sabri
  • I tried this but the "duplicates" are still there... I did another `git clone` and the duplicates are now gone. –  Jun 28 '17 at 06:51
  • Sorry, this is my fault. You are right, git gc won't do anything here: I overlooked the fact that git log shows every commit reachable from the refs (HEAD, branches). The reason, apparently, is that filter-branch keeps the original refs in case you want to revert to the old state; you can find them in the refs/original folder. – Yazeed Sabri Jun 28 '17 at 07:12
  • Don't worry ;-) . Indeed, I recall that Git keeps deleted material reachable through references; it doesn't actually remove anything. Only cloning operations, as I've just understood, do the cleaning, by copying only what I'd call the active or living references. –  Jun 28 '17 at 07:29

I think I'm starting to grasp how Git works (right, better late than never). It turns out all I had to do was make another clone:

$ git clone lib lib-new
$ cd lib-new
$ git remote rm origin
$ git log --reflog --graph --oneline --decorate --date-order
* 578968f (HEAD -> 1.0) Framework Library - Update
* 9e4015d Build 423 - Makefile bugfixes and small changes
* 2f2184f Build 416 - Enhancements and fixes
* 770d628 Build 406 - Enhancements
* 5b8cfef Build 405 - Bugfixes and enhancements
* 44421b9 Intermediate build - Added `wait()` function template to class `Scheduler`
* 5fdc840 Build 395 - Bugfixes and enhancements
* c8b34e1 Build 375 - Bugfixes
* 12cb53f Build 371 - Bugfixes and enhancements
* 981d3f8 Build 360 - Enhancements
* f5127b6 Build 356 - Major bugfix
...

To summarize all operations in one go:

# From the parent directory
git clone -l project-with-lib lib-temp
cd lib-temp

# Detach from the origin:
git remote rm origin

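# filter-branch needs an explicit filter option; --tree-filter runs the command in each commit's checkout
git filter-branch -f --tree-filter 'rm -rf <list of unwanted files/directories>'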
git filter-branch -f --prune-empty
cd ..
git clone -l lib-temp lib

# Detach from the origin:
cd lib
git remote rm origin

# Scrap the temporary work space:
cd ..
rm -rf lib-temp

The first clone creates a directory tree that serves as a temporary work space. Since I work only with local repositories, there is no git push to make this cleanup happen naturally, so it must be done by hand; hence the second clone, which does the housecleaning.
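
As an alternative sketch (not the exact sequence above), the same clone, filter, clone workflow can be wrapped in one function that uses the `--subdirectory-filter` option of git filter-branch. The function name `extract_subdir` and its three arguments are hypothetical. The sketch assumes that the new repository should have the subdirectory as its root, that only the currently checked-out branch matters, and that there are no tags to rewrite:

```shell
# Hypothetical helper: extract_subdir <source-repo> <subdir> <new-repo>
# Assumes <subdir> should become the root of the new repository.
extract_subdir() {
    src=$1 subdir=$2 dest=$3
    work=$(mktemp -d) || return 1

    # Work on a scratch clone so the source repository is never rewritten.
    git clone --quiet --no-local "$src" "$work/scratch" || return 1

    # Keep only the history that touches $subdir and make it the new root.
    # FILTER_BRANCH_SQUELCH_WARNING skips the advisory notice of recent Git versions.
    (cd "$work/scratch" &&
        FILTER_BRANCH_SQUELCH_WARNING=1 \
        git filter-branch --prune-empty --subdirectory-filter "$subdir") || return 1

    # The second clone leaves refs/original and the reflog behind.
    git clone --quiet --no-local "$work/scratch" "$dest" || return 1
    (cd "$dest" && git remote remove origin) || return 1

    # Scrap the temporary work space.
    rm -rf "$work"
}
```

A call such as `extract_subdir project-with-lib lib lib-new` would then leave the result in lib-new. `--no-local` is used instead of `-l` because a clone from a plain local path hardlinks or copies the object files as they are, so the pre-filter objects would still take up disk space in the new repository even though no reference points at them.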