27

I've been struggling for a number of hours now with the --ignore-paths option to git-svn, trying to fetch only certain tags from a large repository.

I want to start the fetch at dev, which looks like

> svn ls http://192.168.0.2/svn/repo/corporation/dev
branches/
tags/
trunk/

The repository directory listing for the complete set of tags looks like this:

> svn ls http://192.168.0.2/svn/repo/corporation/dev/tags
Acme-4.x/
Acme-5.0/
Acme-5.1/
Acme-5.2/
Acme-5.3/
Acme-5.4/
Acme-5.5/
Acme-5.6/
Acme-5.7/
Acme-5.8/
Acme-5.9/

I want to ignore all tags before Acme-5.5.

I attempt to initialize and fetch thusly:

> git svn init  http://192.168.0.2/svn/repo/corporation/dev
> git config svn.authorsfile ../users.txt
> git svn fetch --ignore-paths="Acme-4.x|Acme-5.0|Acme-5.1|Acme-5.2|Acme-5.3|Acme-5.4"

Unfortunately, I still see fetches occurring in the Acme-4.x tag. The fetches show up on the console as telltale paths such as

A       ACME4.4/source/database/mssql/components/functions/vssver.scc

I've tried a bunch of variations on the regex, including full paths to the tags I want to ignore, which after an hour or two turned into utter regex thrashing and desperation. I've also passed the --ignore-paths option to git svn init, all to no avail.

Would someone be kind enough to comment on why the regexes are not suppressing fetches on the paths specified in the ignore regex?

Thanks.

ae6rt
  • Did you try "`/dev/Acme-4.x`" as an ignored path? Or simply just one ignore path like "`dev/Acme`", a bit like in this test: https://mirrors.kilnhg.com/Repo/Mirrors/From-Git/Git/File/t/t9140-git-svn-reset.sh?rev=d96710024f76 – VonC Oct 06 '11 at 06:07
  • This ignore path (dev/tags/Acme-4.x) does not ignore Acme-4.x, nor does this one (dev/Acme-4.x). I can't see how the second one would ignore the path, even if ignores were working: the paths I want to ignore contain "tags/" as a path element. – ae6rt Oct 06 '11 at 16:15
  • Sorry, I meant `tags/Acme` not `dev`. – VonC Oct 06 '11 at 17:10
  • No problem. I figured as much. This doesn't work either: git svn fetch --ignore-paths="tags/Acme-4.x". A real head-scratcher, because I tested the regexes using a Perl testbed script. Given the paths the fetch encounters, my testbed script filters the right paths. – ae6rt Oct 06 '11 at 20:11
  • Remember to use \. for a literal period in a regex – Max Nanasy Feb 08 '15 at 07:15
  • It seems git applies the regexp to a string beginning with `branches`, `tags` or `trunk`. If you want to match from the beginning of the string (`^`), you have to write something like `^[^/]+/blabla(?:/|$)` to apply the ignore to all `blabla` subdirectories under `branches`, `tags` and `trunk`; otherwise git will ignore the parameter. – Andry Apr 07 '19 at 21:59

7 Answers

30

I was having this same problem today: my regexp would just never match... Make sure you know what the target paths actually look like. I was making an incorrect assumption about the structure of the paths that were being fed to my regexp.

To find out what the paths look like, make git-svn output each path to the console as it tests them:

NOTE: Just in case, make a backup copy of the git-svn file first!

  1. Open the git-svn script in a text editor. My script was <git-dir>/libexec/git-core/git-svn.
  2. Locate the is_path_ignored subroutine.
  3. Add a print statement above the first return statement, as follows...
sub is_path_ignored {
    my ($self, $path) = @_;

    print STDERR "$path\n"; # <-- ADD THIS LINE

    return 1 if in_dot_git($path);
    return 1 if defined($self->{ignore_regex}) &&
            $path =~ m!$self->{ignore_regex}!;
    return 0 unless defined($_ignore_regex);
    return 1 if $path =~ m!$_ignore_regex!o;
    return 0;
}

Now use git-svn again with the --ignore-paths switch.

I realised that instead of paths like trunk/baz, git-svn was actually testing paths like bar/trunk/baz.

So instead of

--ignore-paths='^(?:trunk|branches|tags)/baz' 

I needed to use

--ignore-paths='^bar/(?:trunk|branches|tags)/baz'

Don't forget to remove the print statement from the git-svn script.
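
The effect of the missing prefix can be reproduced outside git-svn. The following is only a sketch: it assumes the check is an unanchored regex search on the path (as in the is_path_ignored subroutine above), uses Python's `re` in place of Perl, and the sample paths are hypothetical, shaped like the bar/trunk/baz example.

```python
import re


def is_path_ignored(path: str, ignore_regex: str) -> bool:
    """Rough stand-in for git-svn's check: an unanchored regex search on the path."""
    return re.search(ignore_regex, path) is not None


# Hypothetical paths, shaped like the ones the debug print revealed.
paths = [
    "bar/trunk/baz/file.c",
    "bar/branches/baz/file.c",
    "bar/trunk/qux/file.c",
]

patterns = [
    r"^(?:trunk|branches|tags)/baz",      # assumes paths start at trunk/
    r"^bar/(?:trunk|branches|tags)/baz",  # accounts for the bar/ prefix
]

for pattern in patterns:
    print(pattern)
    for path in paths:
        verdict = "ignored" if is_path_ignored(path, pattern) else "fetched"
        print(f"    {verdict}: {path}")
```

With the first pattern nothing is ignored, because `^` pins the match to the start of the path and the path starts with bar/, not trunk/.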

stucampbell
  • In newer versions of git this subroutine has moved to the Git::SVN::Fetcher module which can be found on Ubuntu and Fedora in /usr/share/perl5/Git/SVN/Fetcher.pm. – Blair Zajac Aug 12 '13 at 18:31
  • So what is the path relative to? Repository root? – Jan Hudec Mar 04 '14 at 19:52
  • On OSX Yosemite I found this routine in /Applications/Xcode.app/Contents/Developer/usr/share/git-core/perl/Git/SVN/Fetcher.pm – juanheyns Jan 07 '15 at 19:23
  • On Windows I found it in `C:\Program Files (x86)\Git\lib\perl5\site_perl\Git\SVN\Fetcher.pm` – turbanoff Apr 07 '15 at 10:16
  • On OSX El Capitan, I found this routine in /usr/local/Cellar/2.7.0/lib/perl5/site_perl/Git/SVN/Fetcher.pm – alyda Jan 29 '16 at 22:29
  • @JanHudec The path seems to be relative to whatever repository URL has been specified on the command line, without leading slashes. – Tblue Feb 18 '16 at 13:45
9

I'm posting this for everyone who was also trying to use --ignore-paths to fetch only specific branches/tags.

After struggling with --ignore-paths for a while, I ended up with the following pattern, which ignores every folder under the branches folder except branchname1 and branchname2:

--ignore-paths='branches/(?!branchname1|branchname2)'
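
As a quick check of how the negative lookahead behaves, here is a small sketch using Python's `re`, which handles `(?!...)` the same way Perl does. The sample paths are hypothetical, and the check is assumed to be an unanchored search on each path.

```python
import re

# The pattern from above: match "branches/" unless it is followed by a kept name.
IGNORE = re.compile(r"branches/(?!branchname1|branchname2)")


def is_ignored(path: str) -> bool:
    """True if the ignore pattern matches anywhere in the path."""
    return IGNORE.search(path) is not None


# Hypothetical paths for illustration only.
for path in [
    "branches/old-feature/src/main.c",
    "branches/branchname1/src/main.c",
    "branches/branchname2/src/main.c",
    "branches/branchname10/src/main.c",
    "trunk/src/main.c",
]:
    print("ignored" if is_ignored(path) else "fetched", path)
```

Note that the lookahead only tests a prefix, so a branch named branchname10 is kept as well; appending `/` inside the lookahead (for example `(?!(?:branchname1|branchname2)/)`) would make the names exact.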

However, the correct solution is hidden at the bottom of the git-svn documentation:

It is also possible to fetch a subset of branches or tags by using a comma-separated list of names within braces. For example:

[svn-remote "huge-project"]
  url = http://server.org/svn
  fetch = trunk/src:refs/remotes/trunk
  branches = branches/{red,green}/src:refs/remotes/project-a/branches/*
  tags = tags/{1.0,2.0}/src:refs/remotes/project-a/tags/*

So in your case, .git/config should list the tags you want to fetch (the brace list selects what to include, not what to ignore), something like this:

tags = tags/{Acme-5.5,Acme-5.6,Acme-5.7,Acme-5.8,Acme-5.9}:refs/remotes/origin/tags/*
DarkPatronusLinX
5

You might also just try:

cat .git/config

on Linux, or:

type .git\config

on Windows, from your new repository directory, to see the fetch URL and the branches and tags mappings.

nickhar
Toughy
4

Would someone be kind enough to comment on why the regexes are not suppressing fetches on the paths specified in the ignore regex?

This path

ACME4.4/source/database/mssql/components/functions/vssver.scc

was fetched despite the --ignore-paths argument because it just didn't match the regex.

There is no - between "ACME" and "4.4" in this path. And if the regex is case-sensitive, "ACME" won't match "Acme".

This should have worked better:

git svn fetch --ignore-paths="ACME4\.|ACME5\.[0-4]"

Note that --ignore-paths is matched against file paths, not tag names.
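
The mismatch is easy to confirm in isolation. This sketch uses Python's `re` as a stand-in for Perl and assumes an unanchored search; the last pattern is a hypothetical one written against the observed path, not something taken from the question.

```python
import re

# The path that was fetched despite --ignore-paths (from the question).
fetched_path = "ACME4.4/source/database/mssql/components/functions/vssver.scc"

# The pattern from the question.
asker_pattern = r"Acme-4.x|Acme-5.0|Acme-5.1|Acme-5.2|Acme-5.3|Acme-5.4"


def matches(pattern: str, path: str, flags: int = 0) -> bool:
    """True if the pattern matches anywhere in the path."""
    return re.search(pattern, path, flags) is not None


# Case differs ("ACME" vs "Acme") and the hyphen is missing, so no match.
print(matches(asker_pattern, fetched_path))

# Ignoring case alone does not help: the hyphen is still required.
print(matches(asker_pattern, fetched_path, re.IGNORECASE))

# A pattern written against the observed path (hypothetical) does match.
print(matches(r"ACME4\.", fetched_path))
```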


(I bet you solved the issue long ago - this post is 4 years old).

Bludzee
4

I had a similar problem and found a partial solution for my case.

Context:
We have only one SVN repository for the Meca (mechanical), hardware and software teams, and the repository is a complete mess, so I tried to use a regex to reduce the area to scan. After one day I just gave up.

In the end I used the --include-paths option to scan only folders with "Src" in their path, which speeds up the scan. I also used these options:
-r to reduce the amount of history fetched locally.
--no-minimize-url, otherwise git-svn will scan the whole repository even if you specify the trunk and branches locations.

git svn clone \
    -r11213:HEAD \
    --prefix svn/ \
    --no-minimize-url \
    --trunk=/trunk/dev/SW/Code/Controller1 \
    --branches=/branches/SW_team/ \
    --include-paths='.*Src.*' \
    https://svnserver.compagny.com/Project1/ \
    Controller1__git__

Note that for now I do not care about the tags.

Hope this helps, even if it's not quite the original question (from 5 years ago :-) ).


EDIT: I cannot add a comment (not enough reputation points), so I am commenting on the question here.

1) --ignore-paths can be given to git svn init, fetch or clone (I do not know whether the behavior differs).
2) --ignore-paths expects a regex. Be careful: "." means any character. Since a literal "." is itself "any character", regex=Acme-5.0 will match string="Acme-5.0" but also string="Acme-580", so it should work anyway.
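
The point about the unescaped period can be checked with a short sketch. It uses Python's `re`, which treats `.` and `\.` the same way Perl does; the paths are hypothetical and only reuse the Acme-5.0 / Acme-580 names from above.

```python
import re

loose = r"Acme-5.0"    # "." matches any single character
strict = r"Acme-5\.0"  # "\." matches only a literal period

# Hypothetical paths for illustration only.
candidates = ["tags/Acme-5.0/readme.txt", "tags/Acme-580/readme.txt"]


def matching(pattern: str) -> list:
    """Return the candidate paths the pattern matches anywhere."""
    return [p for p in candidates if re.search(pattern, p)]


print("loose: ", matching(loose))
print("strict:", matching(strict))
```

The loose pattern still ignores the intended tag; the risk is only that it may also ignore paths that were meant to be kept.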

luney
1

I've struggled with the exact same problem and started editing .git/config to explicitly list the branches or tags that I want.

That approach worked well until I came across an SVN repository with lots of branches. I duly added the ones I wanted and left out the ones I did not, but this failed with configuration file errors. Trial and error suggests that there is a limit on either the number of branches in the config file or, more likely, the total number of characters between the opening { and closing }.

My life would be much easier if I could just build regexes.

0

I have been experiencing strange problems with --ignore-paths too. git-svn seems to ignore the entire regexp in some cases. I have seen the same regexp work on repo 1 and be ignored on repo 2, where both repos have the same file structure but different history.

Although I don't see anything wrong with your regexp for your specific tree, I would recommend using the ^ caret at the beginning to specify the ignored paths starting at the root. This might help the regexp engine speed up the search and avoid issues where a match could also be found deep inside the trunk, for example.

I would use something like --ignore-paths="^tags/Acme-(4|5\.[0-4])"
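
As a sanity check, an anchored pattern of this shape (with the period escaped) can be tried against the tag list from the question. This is only a sketch: it uses Python's `re` in place of Perl, and it assumes the paths git-svn tests are relative to the URL given to git svn init, so that they start with tags/. The file name is hypothetical.

```python
import re

pattern = re.compile(r"^tags/Acme-(4|5\.[0-4])")

# Tag names from the svn ls listing in the question.
tags = ["Acme-4.x"] + [f"Acme-5.{minor}" for minor in range(10)]

ignored, kept = [], []
for tag in tags:
    # Hypothetical file path under each tag, relative to .../corporation/dev.
    path = f"tags/{tag}/source/build.xml"
    (ignored if pattern.search(path) else kept).append(tag)

print("ignored:", ignored)
print("kept:   ", kept)
```

If git-svn reports paths with a different prefix, the `^tags/` anchor would need adjusting to match what it actually sees.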