To adjust the question a bit to make it clearer (at least to me): we have any number of /bin/somename/... and .../bin/anothername/... names that should be ignored, along with three sets of names, .../bin/folder1/..., .../bin/2folder/..., and .../Bin/third/..., that should not be ignored.
Hence, we want a regular expression that (without anchoring) will match the names-to-be-ignored but not the ones-to-be-kept. (Furthermore, glob matching won't work, since it's not as powerful: we'll either match too little or too much, and Mercurial lacks the "override with later un-ignore" feature of Git.)
The shortest regular expression for this should be:
/[Bb]in/(?!(folder1|2folder|third)/)
(The part of this regex that actually matches a string like /bin/somename/...
is only the /bin/
part, but Mercurial does not look at what matched, only whether something matched.)
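As a quick sanity check, the pattern can be tried outside Mercurial. This is only a sketch: it uses Python's re.search as a stand-in for Mercurial's unanchored matching, and the sample paths and the is_ignored helper are made up for illustration, not taken from the question.

```python
import re

# The pattern from above. re.search is used here as an approximation of
# Mercurial's unanchored regexp matching (an assumption of this sketch).
IGNORE = re.compile(r"/[Bb]in/(?!(folder1|2folder|third)/)")


def is_ignored(path: str) -> bool:
    """Return True if the pattern is found anywhere in the path."""
    return IGNORE.search(path) is not None


# Hypothetical file names, chosen only to exercise each case.
samples = [
    "proj/bin/somename/a.o",      # should be ignored
    "proj/bin/anothername/b.o",   # should be ignored
    "proj/bin/folder1/keep.txt",  # should be kept
    "proj/bin/2folder/keep.txt",  # should be kept
    "proj/Bin/third/keep.txt",    # should be kept
]

for path in samples:
    print(path, "ignored" if is_ignored(path) else "kept")
```

If the results here look right but Mercurial still behaves differently, that suggests the problem lies somewhere other than the regular expression itself.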
The thing is, your example regular expression should work: it's just a longer variant of this same thing, with a not-required but harmless (except for performance) .* added at the front and back. So if yours isn't working, the above probably won't work either. A sample repository, with some dummy files, that one could clone and experiment with, would help diagnose the issue.
Original (wrong) answer (to something that's not the question)
The shortest regular expression for the desired case is:
/[Bb]in/Folder[123]/
However, if the directory / folder names do not actually meet this kind of pattern, we need:
/[Bb]in/(somedir|another|third)/
Explanation
First, a side note: the default syntax is regexp, so the initial syntax: regexp line is unnecessary. It's also possible that your .hgignore file is not in proper UTF-8 format: see Mercurial gives "invalid pattern" error for simple GLOB syntax. (But that would produce different behavior, so that's probably not the problem. It's just worth mentioning in any answer about .hgignore files malfunctioning.)
Next, it's worth noting a few items:
Mercurial tracks only files, not directories / folders. So the real question is whether any given file name matches the pattern(s) listed in .hgignore. If it does match, and the file is currently untracked, the file will not be automatically added with a sweeping "add everything" operation, and Mercurial will not gripe that the file is untracked.
If some file is already tracked, the fact that its name matches an ignore pattern is irrelevant. If the file a/b/c.ext
is not tracked and does match a pattern, hg add a/b/c.ext
will add it anyway, while hg add a/b
will en-masse add everything in a/b
but won't add c.ext
because it matches the pattern. So it's important to know whether the file is already tracked, and consider what you explicitly list to hg add
. See also How to check which files are being ignored because of .hgignore?, for instance.
Glob patterns are much easier to write correctly than regular expressions. Unless you're doing this for learning or teaching purposes, or glob is just not powerful enough, stick with the glob patterns. (In very old versions of Mercurial, glob matching was noticeably slower than regexp matching, but that's been fixed for a long time.)
Mercurial's regexp ignore entries are not automatically anchored: if you want anchored behavior, use ^
at the front, and $
at the end, as desired. Here, you don't want anchored behavior, so you can eliminate the leading and trailing .*
. (Mercurial refers to this as rooted rather than anchored, and it's important to note that some patterns are anchored, but .hgignore
ones are not.)
Python/Perl regexp (?!...) syntax is the negation (negative lookahead) syntax: (?!...) matches if the parenthesized expression doesn't match the string at that point, and it consumes no characters itself. This is part of the problem.
We need not worry about capturing groups (see capturing group in regex) as Mercurial does nothing with the groups that come out of the regular expression. It only cares if we match.
Path names are really slash-separated components. The leading components are the various directories (folders) above the file name, and the final component is the file name. (That is, try not to think of the first parts as folders: it's not that it's wrong, it's that it's less general than "components", since the last part is also a component.)
What we want, in this case, is to match, and therefore "ignore", names that have one component that matches either bin
or Bin
followed immediately by another component that matches Folder1
, Folder2
, or Folder3
that is followed by a component-separator (so that we haven't stopped at /bin/Folder1
, for instance, which is a file named Folder1
in directory /bin
).
The strings bin
and Bin
both end with a common trailing part of in
, so this is recognizable as (B|b)in
, but single-character alternation is more easily expressed as a character class: [Bb]
, which eliminates the need for parentheses and vertical-bars.
The same holds for the names Folder1
, Folder2
, and Folder3
, except that their common string leads rather than trails, so we can use Folder[123]
.
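To make the last two points concrete, here is a small sketch comparing the spelled-out alternation with the character-class form. It assumes Python's re.search approximates Mercurial's unanchored matching; the paths and the matches helper are invented for the example.

```python
import re

# Spelled-out alternation versus the shorter character-class form.
verbose = re.compile(r"/(B|b)in/(Folder1|Folder2|Folder3)/")
compact = re.compile(r"/[Bb]in/Folder[123]/")


def matches(pattern: re.Pattern, path: str) -> bool:
    """Unanchored match: True if the pattern occurs anywhere in the path."""
    return pattern.search(path) is not None


# Hypothetical file names covering matching and non-matching cases.
paths = [
    "proj/bin/Folder1/a.txt",    # matches
    "proj/Bin/Folder3/b.txt",    # matches
    "proj/bin/Folder4/c.txt",    # no match: Folder4 is not listed
    "proj/bin/Folder1",          # no match: no trailing component separator
    "proj/cabin/Folder2/d.txt",  # no match: "/cabin/" does not contain "/bin/"
]

# Map each path to (verbose result, compact result).
results = {p: (matches(verbose, p), matches(compact, p)) for p in paths}
for path, (v, c) in results.items():
    print(path, v, c)
```

Both forms should agree on every path; the character-class version is simply shorter and needs no grouping.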
Suppose we had anchored matches. That is, suppose Mercurial demanded that we match the whole file name, which might be, say, /foo/hello/bin/Folder2/bar/world.ext
. Then we'd need .*/[Bb]in/Folder[123]/.*
, because we'd need to match any number of characters to skip over /foo/hello
before matching /bin/Folder2/
, and again skip over any number of characters to match bar/world.ext
, in order to match the whole string. But since we don't have anchored matches, we'll find the pattern /bin/Folder2/
within the whole string, and hence ignore this file, using the simpler pattern without the leading and trailing .*
.
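The difference between anchored and unanchored matching can be sketched in Python, with re.fullmatch playing the role of a hypothetical "must match the whole name" rule and re.search playing the role of the unanchored behavior described above. This illustrates the idea only and is not Mercurial's actual matching code.

```python
import re

path = "/foo/hello/bin/Folder2/bar/world.ext"

short = r"/[Bb]in/Folder[123]/"       # pattern without padding
padded = r".*/[Bb]in/Folder[123]/.*"  # padded with leading and trailing .*

# Unanchored: the short pattern is found inside the path.
found_unanchored = re.search(short, path) is not None

# Anchored (whole string must match): the short pattern alone is not enough...
short_whole = re.fullmatch(short, path) is not None

# ...but the padded pattern covers the entire string.
padded_whole = re.fullmatch(padded, path) is not None

print(found_unanchored, short_whole, padded_whole)
```

Under unanchored matching the padding adds nothing except extra work for the regexp engine.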