Can I restrict grep (or map) to first match when I'm expecting only one match?

Question

I am literally brand new to Perl. I have the following...

#!/usr/bin/perl

#config file
$config = "$ENV{'HOME'}/.config/.wallpaperrc";

open my $handle, '<', $config;
chomp(my @lines = <$handle>);
close $handle;

@regular = map(/^regular\.roll (.*)$/, @lines);

print(@regular);

It works but it seems awkward and improper to be using an array when I am only expecting one match and only want one match. If I make @regular scalar then the function returns the number of matches instead.

I tried to search for the answer but the results are muddied with all the questions using Perl grep within Bash.

Tip: You really should check if `open` succeeds or not, if only by adding `or die $!`. It's one of the most likely things to fail — ikegami, Sep 04 '19 at 01:05

mob · Accepted Answer · 2019-09-04T00:38:10.263

7

You can capture a single match by assigning to a scalar wrapped in parentheses, which makes the assignment happen in list context:

($regular) = map(/^regular\.roll (.*)$/, @lines);

The parentheses on the left-hand side are important; without them you impose scalar context on the right-hand side and the result will be something else, such as the number of elements.
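A minimal sketch of the difference between the two assignments, using made-up sample lines (the `@lines` contents and the variable names `$first_capture` and `$count` are assumptions for illustration, not part of the question):

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the lines of the config file.
my @lines = (
    'other.key 1',
    'regular.roll /tmp/a.png',
    'regular.roll /tmp/b.png',
);

# List assignment: the parentheses make the left side a list, so the
# first value produced by map is stored and the rest are discarded.
my ($first_capture) = map { /^regular\.roll (.*)/ } @lines;

# Scalar assignment: map is in scalar context and returns how many
# values it generated instead of the values themselves.
my $count = map { /^regular\.roll (.*)/ } @lines;

print "first capture: $first_capture\n";
print "count: $count\n";
```

So this is less like a cast applied afterwards and more that the left-hand side decides which context the right-hand side is evaluated in.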

If you're trying to capture the first match from grep (but not map) and you are more comfortable using Perl modules, the first function in the List::Util package returns the first match, and is more efficient than calling grep and discarding all the extra matches.

use List::Util 'first';
...
$regular = first { /pattern/ } @input;
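Note that `first` returns the matching element itself, not the regex captures. One possible way to still get the capture group, sketched here with hypothetical sample data, is to match again against the single line that `first` returned:

```perl
use strict;
use warnings;
use List::Util 'first';

# Hypothetical sample data standing in for the lines of the config file.
my @input = (
    'other.key 1',
    'regular.roll /tmp/a.png',
    'regular.roll /tmp/b.png',
);

# first stops at the first element for which the block is true and
# returns that element (the whole line here), or undef if none matches.
my $line = first { /^regular\.roll / } @input;

# A second match against that one line extracts the capture group.
my ($regular) = defined($line) ? $line =~ /^regular\.roll (.*)/ : ();

print "line:    $line\n"    if defined $line;
print "capture: $regular\n" if defined $regular;
```

The pattern runs twice on the matching line, which is usually negligible compared with scanning the rest of the list.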

edited Sep 04 '19 at 00:38

answered Sep 04 '19 at 00:09


So kind of like casting a type after the operation? – deanresin Sep 04 '19 at 00:11
Can `first` return capture groups? – deanresin Sep 04 '19 at 01:16

GMB · Answer 2 · 2019-09-04T00:08:51.660

4

You could assign the results of the operation to a list that contains just one element:

my ($regular) = map(/^regular\.roll (.*)$/, @lines);
print $regular;

edited Sep 04 '19 at 00:08

answered Sep 04 '19 at 00:06


Neat, it worked. Can you explain what your syntax is doing? – deanresin Sep 04 '19 at 00:08

And you can make sure you found something using `my ($regular) = map(...) or die "No matching lines\n";` – ikegami Sep 04 '19 at 01:06
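The check in that comment works because a list assignment evaluated in scalar context yields the number of elements on its right-hand side, so zero matches is false. A small sketch with hypothetical input that contains no matching line (the `eval` wrapper is only there so the example can show the message instead of exiting):

```perl
use strict;
use warnings;

# Hypothetical input with no regular.roll line in it.
my @lines = ('other.key 1', 'delay 30');

# eval traps the die so the sketch can report it and keep running.
my $ok = eval {
    my ($regular) = map { /^regular\.roll (.*)/ } @lines
        or die "No matching lines\n";
    1;
};
my $error = $ok ? '' : $@;

print "caught: $error" if $error;
```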

zdim · Answer 3 · 2019-09-10T04:57:46.240

Note: See the end for how to stop right after the first match (one statement, with a module).

In order for the regex match operator to return the captures themselves, it does indeed need to be invoked in list context. But then you can form that list as you wish, for instance with just one scalar, to catch only the first of the returned list of scalars:

my ($regular) = map { /^regular\.roll (.*)/ } @lines;

Here the ($v1, $v2,...) on the LHS provides the list context for the assignment operator, and with only one variable the first of the returned list of (.*) captures is assigned and the rest discarded.

The above has mostly been stated already, but I think it is important to comment on a few other things in the question as well.

Always have use warnings; and use strict; at the beginning of a program.
An open statement must be tested for failure, and if it fails you should print the error. Commonly:
```
open my $fh, '<', $file  or die "Can't open $file: $!";
```
I suggest doing the chomp in a separate statement.
There is no need for the $ anchor in that regex (except with a multiline string and the /m modifier).
When printing an array, if you put it in double quotes its elements are interpolated with spaces in between (see $"):
```
say "@regular";
```
or print each element on its own line:
```
say for @regular;
```
To be able to use say you need use feature qw(say);
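Put together, the points above might look like the following sketch. The config contents are invented, and an in-memory filehandle (`\$config_text`) stands in for the real file so the example is self-contained; a real script would pass the file path to `open` instead.

```perl
use strict;
use warnings;
use feature qw(say);

# Hypothetical config contents, used in place of ~/.config/.wallpaperrc.
my $config_text = "delay 30\nregular.roll /tmp/a.png\nregular.roll /tmp/b.png\n";

# Test open for failure and report the reason held in $!.
open my $fh, '<', \$config_text or die "Can't open config: $!";
my @lines = <$fh>;
chomp @lines;               # chomp as its own statement
close $fh;

# No trailing $ anchor: .* already runs to the end of each line.
my @regular = map { /^regular\.roll (.*)/ } @lines;

say "@regular";             # elements joined with a space
say for @regular;           # one element per line
```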

Since only the first match is needed, we'd rather not go through the rest of the list once a match is found. This can be achieved using first_result (alias firstres) from List::MoreUtils, riffing off of mob's idea:

my $regular = firstres { my ($m) = /^regular\.roll (.*)/; $m } @lines;

The syntax inside the block is a little wordy but returning $1 after a lone regex didn't work for me (?). If having two statements is a bother this can be shortened, at the expense of readability

my $regular = firstres { ( /^regular\.roll (.*)/ )[0] } @lines;

where () around the regex provide for list context, and [0] takes the first element of that list. I added spaces around regex to try to alleviate that syntax a little; they aren't needed.
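If installing List::MoreUtils is not an option, a core-only alternative (a different technique from firstres, swapped in here for illustration) is to read the file line by line and leave the loop at the first match, which also avoids reading the rest of the file. The config contents and the `$lines_read` counter are assumptions made for the sketch:

```perl
use strict;
use warnings;

# Hypothetical config contents; an in-memory filehandle replaces the real file.
my $config_text = "delay 30\nregular.roll /tmp/a.png\nregular.roll /tmp/b.png\n";
open my $fh, '<', \$config_text or die "Can't open config: $!";

my $regular;
my $lines_read = 0;          # only here to show where reading stopped
while (my $line = <$fh>) {
    $lines_read++;
    if ($line =~ /^regular\.roll (.*)/) {
        $regular = $1;       # . does not match the newline, so no chomp is needed
        last;                # stop reading as soon as a line matches
    }
}
close $fh;

print "found '$regular' after reading $lines_read lines\n" if defined $regular;
```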

Thanks for the info. It annoys me Perl doesn't just use `use strict`, `use warnings` by default. — deanresin, Sep 04 '19 at 00:56
@deanresin My sentiment as well. I have a shortcut in my vim (`wss`, warning-strict-say) which enters them. Somewhere in docs it says, half jokingly, that `warnings` not being enforced is a bug — zdim, Sep 04 '19 at 00:59
@deanresin It can't for backwards compatibility. That said, `use 5.012;` and higher effectively does `use strict;` for you. I so wish they made that enable warnings too. (They can always be turned off using `no warnings;` if you happen to want them off.) — ikegami, Sep 04 '19 at 01:07
@deanresin There are "frameworks" (large modules I mean) which enable them, for example `Moose` (or `Moo`) etc. But, yeah, as ikegami perfectly puts it "_I so wish..._" ... — zdim, Sep 04 '19 at 01:17

score 0 · Answer 4 · answered Sep 04 '19 at 00:31

You can use a standard foreach loop and terminate it when the match is found.

use strict; use warnings;

# sample array to be searched
my @array = qw( A B C );

my $match;  # variable to hold matching element
# "last" terminates the loop when /B/ pattern matches
# print below is only for debug purposes to show which elements are tested
print("? $_\n") and /B/ and $match = $_ and last foreach @array;
# below is short version
# /B/ and $match = $_ and last foreach @array;

# print $match if it is defined (if it has been assigned in the foreach loop)
print "MATCH: $match\n" if defined($match);
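One caveat with the chained `and` form: `last` only runs when the assigned value is true, so a matching element such as the string "0" does not stop the loop. The sketch below compares the chained form with a block-form loop on hypothetical data and a hypothetical single-digit pattern chosen to trigger that case:

```perl
use strict;
use warnings;

# Hypothetical data that contains the string "0" before another digit.
my @array = qw( A 0 B 7 );

# Chained form: "0" matches and is assigned, but the assignment is false,
# so "last" is skipped and the loop carries on to "7".
my $chained;
/^\d$/ and $chained = $_ and last foreach @array;

# Block form: "last" does not depend on the truth of the matched value.
my $match;
foreach my $item (@array) {
    if ($item =~ /^\d$/) {
        $match = $item;
        last;
    }
}

print "chained form: $chained\n";
print "block form:   $match\n";
```

With the /B/ pattern from the answer the two forms behave the same, since any element containing "B" is a true value.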

Otheus · Answer 5 · 2022-07-21T12:45:37.267

You can optimize the grep / map somewhat by memoizing the result of the first match. Unfortunately, the rest of the items in the array are still processed, but each item is not fully evaluated thanks to short-circuit boolean logic.

$ perl -w -lane 'local($first); print join(" ","** ",' \
      -e  'grep { $first=1 if !$first and /^[A-Z]+$/ } @F)'
no No .NO. -NO- n0 no
**
YES no
** YES
no YES NO
** YES

In this example, we want to grep each line of input (-n) for the first word (-a splits the line into words and saves in @F) that matches the pattern of only upper-case letters. Without the memoization, we would get "NO" in the output of the final line.

If the array is very long, you will save some CPU cycles/time, since after the first match the grep expression will only evaluate ! $first, which is then false. The rest of the expression will not be evaluated.

When using this with map, you do need to be cautious when the matched input is a string that evaluates as false, such as "0". See here:

perl -w -lane 'local($first); print join(" ","** ",' -e  'map { $first=$_ if !$first and /^[A-Z0-9]+$/ } @F)'
0 NO
**  0 NO

(This is BAD!).

On the second item, $first holds the string "0", which Perl treats as false, so !$first is still true and the second item is matched as well.

So, with map, to be on the safe side, use ! defined $first.

perl -w -lane 'local($first); print join(" ","** ",' -e  'map { $first=$_ if !defined $first and /^[A-Z0-9]+$/ } @F)'
0 NO
**  0

(correct)
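The same idea written as a script rather than a one-liner, as a rough sketch: the word list is hypothetical, and the `$regex_runs` counter is added only to make visible how often the pattern is actually tried.

```perl
use strict;
use warnings;

# Hypothetical input words.
my @words = qw( no YES NO MAYBE );

my $found;            # stays undef until the first match
my $regex_runs = 0;   # counts how often the pattern is actually tried

my @hits = grep {
    # Once $found is defined, && short-circuits and the regex is skipped.
    !defined($found)
        && do { $regex_runs++; /^[A-Z]+$/ }
        && defined($found = $_)
} @words;

print "hits: @hits\n";
print "regex tried $regex_runs times for ", scalar(@words), " words\n";
```

grep still visits every element, but after the first match each visit costs only the `defined` test.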
