
I cloned an svn repository using git-svn, but I excluded everything except one folder (let's say folder A) using --include-paths. Since the svn repository contains many commits for other files (outside of that folder), the resulting git repository now also has many commits that reflect the same state as their parent commit and are in that sense "empty". How can I remove these commits from my git repository before pushing it for the first time?

I found the --prune-empty option in git-filter-repo, but it seems to act only on commits that are affected by other command options. At least

git filter-repo --prune-empty auto

or

git filter-repo --path 'A' --prune-empty auto

didn't change anything for me.

I found examples using filter-branch (https://stackoverflow.com/a/5324916/15137778), but I would like to avoid that as it is very slow (and its own docs recommend filter-repo instead). Is there a way to do it with filter-repo?

Here is an example in case the explanation was not clear:

Let's assume I have the following svn repository.

root
  |-- A
    |-- 1.txt
    |-- 2.txt
  |-- B
    |-- 3.txt
    |-- 4.txt
  |-- C
    |-- 5.txt
    |-- 6.txt

Then I made a clone using git-svn only including the directory A which leads to this file structure in my new git repository:

root
  |-- A
    |-- 1.txt
    |-- 2.txt

But the git repository still contains commits that changed for example C/5.txt in the original repository. So apparently all commits from the original svn repository were kept. But since that file was filtered out of my git repository, these commits refer to the exact same tree as their parent (no changes). So they are useless and I don't want to keep them. Thanks in advance for any help or advice.
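To make the criterion concrete, here is a minimal sketch, assuming "empty" means that a commit's tree hash equals the tree hash of its first parent. The function name `list_unchanged_commits` is made up for illustration. It only lists candidate commits and rewrites nothing; root commits and merge commits are skipped.

```shell
#!/bin/sh
# Hypothetical helper: print every commit reachable from the given revision
# (default HEAD) whose tree is identical to the tree of its single parent.
list_unchanged_commits() {
    # %H = commit hash, %T = tree hash, %P = parent hashes (space separated)
    git log --format='%H %T %P' "${1:-HEAD}" |
    while read -r commit tree parent rest; do
        # Skip root commits (no parent) and merge commits (several parents).
        if [ -z "$parent" ] || [ -n "$rest" ]; then
            continue
        fi
        parent_tree=$(git rev-parse "$parent^{tree}")
        if [ "$tree" = "$parent_tree" ]; then
            echo "$commit"
        fi
    done
}
```

Running `list_unchanged_commits HEAD | wc -l` inside the clone should give a rough count of how many commits fall into this category.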

Pascal
  • The whole question is based on a misconception of what Git is. "But since that file isn't anymore in my git repository, these commits are empty" No they're not. They may not reflect a different state of the files within the directory A, but they are not empty commits. A commit is a snapshot of the _whole_ project. Your include-paths doesn't somehow magically change that basic truth. If you want to rewrite history, fine, use filter-repo. But obviously pruning empty commits will do nothing, as none of your commits _are_ empty. – matt Feb 09 '22 at 12:40
  • @matt Thanks for your comment! Ok, I get why the term 'empty' is wrong here. I may correct that in the question later. But I guess you understood which commits I meant – the commits whose snapshot is exactly the same as that of their parent(s). So still, **how** do I remove those using filter-repo or something else? – Pascal Feb 09 '22 at 17:11
  • You can ask filter-repo to pretend that a certain file/folder does not exist. The docs are here: https://htmlpreview.github.io/?https://github.com/newren/git-filter-repo/blob/docs/html/git-filter-repo.html – matt Feb 09 '22 at 17:25
  • The problem is, the folders that filter-repo should pretend are not there (e.g. folder `B`) already aren't there, because they were filtered out during git svn clone. So there is no way filter-repo could know which commits belong to those old, no longer existing folders. That's why I want to use the criterion that the commit uses the same tree as its parent. – Pascal Feb 09 '22 at 18:39
  • The --prune-empty option seems to be exactly what I want, but I guess it only works on commits that are touched by another filter option. (BTW: the filter-repo documentation seems to call commits that have the same tree as their parent 'empty'. Same with filter-branch.) It might work if I first clone the whole svn repo and only later remove the unwanted folders from my git clone. But the repo is huge... Trying `git filter-branch --commit-filter 'git_commit_non_empty_tree "$@"' HEAD` now (https://stackoverflow.com/a/5324916/15137778). Hopefully it will be done tomorrow, filter-branch is so slow... – Pascal Feb 09 '22 at 18:39
  • You should use filter-repo, not filter-branch. – matt Feb 09 '22 at 18:44
  • "Because they were filtered out during git svn clone" But they were not filtered out of the commits. Also you can use inversion to say "keep only this one folder". Please read the docs. – matt Feb 09 '22 at 18:45
  • "But they were not filtered out of the commits." On what basis do you say that? I can confirm by checking out specific commits that no files from folders `B` and `C` are in there. I think the `--include-paths` option of git-svn works similarly to the `--path` option of filter-repo, but the commits stay because it doesn't have anything like the `--prune-empty` option. I did read the docs. Actually you specify the paths you want to keep and use inversion to say "keep everything except the provided paths". – Pascal Feb 09 '22 at 19:14
  • I don't know if filter-repo has the option to discard commits it didn't touch itself, but if it does not, you could ask that this be added. The old filter-branch program does have that option. – torek Feb 09 '22 at 20:04
  • Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. – Community Feb 19 '22 at 16:05
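
On the open point from the comments, whether filter-repo can discard commits it did not touch itself: the filter-repo manual documents three values for `--prune-empty` (`always`, `auto`, `never`). `auto` is the default and prunes only commits that become empty through the rewrite, which would match the observation that `auto` changed nothing here. Below is a hedged sketch using `always`; it has not been verified against this particular git-svn clone. The wrapper name `prune_unchanged_commits` is hypothetical, and `--force` is an assumption, needed only because the repository is not a fresh clone, so this is best tried on a backup copy first.

```shell
#!/bin/sh
# Hypothetical wrapper: rewrite the repository at the given path (default: the
# current directory) and drop every commit whose tree equals its parent's tree.
# WARNING: this rewrites history in place; --force skips filter-repo's
# fresh-clone safety check.
prune_unchanged_commits() {
    repo=${1:-.}
    ( cd "$repo" && git filter-repo --force --prune-empty always )
}
```

Calling `prune_unchanged_commits /path/to/backup-copy` (a placeholder path) and then comparing `git rev-list --count HEAD` before and after should show whether the unchanged commits were dropped.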
