
I gave a list of glob patterns and one plain string to Perl's glob function. The patterns behave as expected, but the plain string is always returned, whether or not a matching file exists. For example:

$ ls
foo
$ perl -le '@files=glob("*bar"); print @files' ## prints nothing, as expected
$ perl -le '@files=glob("bar"); print @files'
bar

As you can see above, the second example prints bar even though no such file exists.

My first thought is that it behaves like the shell in that when no expansion is available, a glob (or something being treated as a glob) expands to itself. For example, in csh (awful as it is, this is what Perl's glob() function seems to be following, see the quote below):

% foreach n (*bar*)
foreach: No match.

% foreach n (bar)
foreach? echo $n
foreach? end
bar                     ## prints the string

However, according to the docs, glob should return filename expansions (emphasis mine):

In list context, returns a (possibly empty) list of filename expansions on the value of EXPR such as the standard Unix shell /bin/csh would do.

So why is it returning itself when there are no globbing characters in the argument passed to glob? Is this a bug or am I doing something wrong?

terdon
  • Why would you pass a single constant value (with no wildcards) to `glob` like that? – toolic Jul 06 '16 at 17:55
  • @toolic to keep my question simple. The actual use case is slightly more complicated. In my script, I am passing an array like `("*foo", "*bar", "string")`, so various globs and one exact file name. I can think of various workarounds, but am curious as to why `glob()` behaves like this only on the simple string. – terdon Jul 06 '16 at 17:59
  • This is nit-picky, but both `"*bar"` and `"bar"` are strings. You meant to say "Why does Perl's glob() function always return a file name when given a string with no wildcards instead of a string with wildcards?" – ThisSuitIsBlackNot Jul 06 '16 at 20:48
  • @ThisSuitIsBlackNot fair enough, a glob is as a glob does, after all. Whether something is a string or a glob depends on the context in which it is used. Since it was, apparently, used as a simple string and not a glob here, I called it so. But you're right, I could have been more precise. – terdon Jul 06 '16 at 20:54
  • "bar" is already fully expanded and valid as a filename. Only when adding wildcards does the system need to look at the file system to find matches. – dsm Jul 07 '16 at 00:35
  • @dsm "Only when adding wildcards does the system need to look at the file system to find matches." Not true. Perl checks the file system whether there are wildcards or not. – ThisSuitIsBlackNot Jul 07 '16 at 02:09

2 Answers


When you use ? or * or [], only existing files or directories will be returned. When your pattern just has literal text or {}, all possible results will be returned. This exactly matches what csh does.

Often, people will do @results = grep -e, glob PATTERN because of this.
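A minimal sketch of that `grep -e` idiom, applied to the mixed list of patterns from the question. The helper name `existing_matches` and the temporary directory are assumptions made for illustration; only `glob`, `grep` and `-e` come from the answer itself.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: expand each pattern with glob(), then keep only
# the results that actually exist on disk.
sub existing_matches {
    my @patterns = @_;
    return grep { -e $_ } map { glob($_) } @patterns;
}

# Recreate the question's setup: a directory containing only "foo".
my $dir = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";
open my $fh, '>', 'foo' or die "open foo: $!";
close $fh or die "close foo: $!";

# "*foo" matches foo, "*bar" matches nothing, and the literal "string"
# is returned by glob() but then dropped by the -e test.
print "$_\n" for existing_matches('*foo', '*bar', 'string');
```

Note that the core `glob` splits its argument on whitespace, so this sketch assumes patterns without spaces.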

Or you can use File::Glob::bsd_glob if you want more control over this. (Note that there is no additional overhead to doing this; since perl 5.6 when you use glob() perl quietly loads File::Glob and uses it.)
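As a sketch of that extra control: `bsd_glob` takes an optional flags argument, and leaving `GLOB_NOMAGIC` out of the flag set should make literal and `{}` patterns return only names that exist. The helper name `strict_glob` and the exact flag combination are illustrative assumptions, not something the answer prescribes.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob GLOB_BRACE GLOB_QUOTE GLOB_TILDE GLOB_ALPHASORT);
use File::Temp qw(tempdir);

# Assumed flag set: the csh-style behaviours without GLOB_NOMAGIC, so a
# pattern that matches nothing expands to an empty list.
my $STRICT_FLAGS = GLOB_BRACE | GLOB_QUOTE | GLOB_TILDE | GLOB_ALPHASORT;

# Hypothetical helper wrapping bsd_glob() with those flags.
sub strict_glob {
    my ($pattern) = @_;
    return bsd_glob($pattern, $STRICT_FLAGS);
}

# A directory containing only "foo", as in the question.
my $dir = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";
open my $fh, '>', 'foo' or die "open foo: $!";
close $fh or die "close foo: $!";

# Compare the strict flag set with bsd_glob's default csh-style flags.
for my $pattern ('foo', 'bar', '{foo,bar}', '*bar') {
    my @strict  = strict_glob($pattern);
    my @default = bsd_glob($pattern);
    printf "%-10s strict=(%s) default=(%s)\n",
        $pattern, join(',', @strict), join(',', @default);
}
```

With only `foo` present, the strict variant should drop `bar` from both the literal and the `{foo,bar}` pattern, while the default flags return it.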

ysth
  • Fair enough, thanks. I now found the relevant POSIX section which I should have checked in the first place. I guess I expected Perl to be checking for file existence in the background. – terdon Jul 06 '16 at 18:07
  • I confirm bash does that too. If you want to avoid this kind of behavior, you need to use the `GLOB_BRACE` option with `File::Glob::bsd_glob`. – Casimir et Hippolyte Jul 06 '16 at 20:17
  • @terdon POSIX sh filename expansion is different from what perl does (it has no {} support and returns the pattern if no filenames match even with `?`, `*`, or `[]`); as mentioned in the perl documentation you quote, perl is trying for compatibility with csh (for which I can't find a good link) – ysth Jul 06 '16 at 20:40
  • @CasimiretHippolyte Actually, if there are no matches, bash returns the pattern even if it contains wildcards, which is different from csh: `echo f*; touch foo; echo f*`. You can change this by setting `failglob`. With File::Glob, you control this behavior with `GLOB_NOCHECK` or `GLOB_NOMAGIC`. – ThisSuitIsBlackNot Jul 06 '16 at 20:41
  • @ysth ah, true enough. I keep trying to forget about csh. Sorry about the unaccept, by the way, but the other answer gave some useful detail (and I doubt you really need the rep ;). – terdon Jul 06 '16 at 20:55

"I guess I expected Perl to be checking for file existence in the background."

Perl is checking for file existence:

$ strace perl -e'glob "foo"' 2>&1 | grep foo
execve("/home/mcarey/perl5/perlbrew/perls/5.24.0-debug/bin/perl", ["perl", "-eglob \"foo\""], [/* 39 vars */]) = 0
lstat("foo", {st_mode=S_IFREG|0664, st_size=0, ...}) = 0

"So why is it returning itself when there are no globbing characters in the argument passed to glob?"

Because that's what csh does. Perl's implementation of glob is based on glob(3) with the GLOB_NOMAGIC flag enabled:

GLOB_NOMAGIC

Is the same as GLOB_NOCHECK but it only appends the pattern if it does not contain any of the special characters *, ? or [. GLOB_NOMAGIC is provided to simplify implementing the historic csh(1) globbing behavior and should probably not be used anywhere else.

GLOB_NOCHECK

If pattern does not match any pathname, then glob() returns a list consisting of only pattern...

So, for a pattern like foo with no wildcards:

  • if a matching file exists, the filename expansion (foo) is returned
  • if no matching file exists, the pattern (foo) is returned

Since the filename expansion is the same as the pattern,

glob 'foo'

in list context will always return a list with the single element foo, whether the file foo exists or not.
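The two flags quoted above can be compared directly through `File::Glob::bsd_glob`. This is a minimal sketch that runs in an empty temporary directory (an assumption for illustration), so no pattern can match an actual file and whatever comes back is the no-match fallback.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob GLOB_NOCHECK GLOB_NOMAGIC);
use File::Temp qw(tempdir);

# Work in an empty directory so that nothing can match.
my $dir = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";

# Each case: pattern, flags passed to bsd_glob(), label for the output.
my @cases = (
    [ 'foo',  0,            'no flags'     ],
    [ 'foo',  GLOB_NOMAGIC, 'GLOB_NOMAGIC' ],
    [ '*foo', GLOB_NOMAGIC, 'GLOB_NOMAGIC' ],
    [ '*foo', GLOB_NOCHECK, 'GLOB_NOCHECK' ],
);

for my $case (@cases) {
    my ($pattern, $flags, $label) = @$case;
    my @result = bsd_glob($pattern, $flags);
    printf "%-5s %-13s -> (%s)\n", $pattern, $label, join(',', @result);
}

# The core glob() uses the csh-style defaults, which include GLOB_NOMAGIC.
my @core = glob('foo');
printf "%-5s %-13s -> (%s)\n", 'foo', 'core glob', join(',', @core);
```

Only the wildcard-free pattern under `GLOB_NOMAGIC` (and any pattern under `GLOB_NOCHECK`) should come back unchanged; with no flags, a missing `foo` should yield an empty list.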

ThisSuitIsBlackNot
  • this answer does not take into account patterns like '{foo,bar}' – ysth Jul 06 '16 at 21:19
  • @ysth The OP didn't ask about `{}`, they asked about a pattern with no wildcards. But I see how my answer could have been unclear; edited to make it clear that I'm only talking about patterns with no wildcards. – ThisSuitIsBlackNot Jul 06 '16 at 21:56