It isn't entirely clear what you're asking, but from the context it looks like you want a pattern that matches any of several entries in an array, since readlines returns an array containing the lines of the file.
A simple pattern example would be:
%w[foo bar baz].grep(/foo|bar/) # => ["foo", "bar"]
| means "or", so the pattern /foo|bar/ matches "foo" or "bar". grep iterates over the array ['foo', 'bar', 'baz'] and returns both matching elements.
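Applied to lines read from a file, the same idea might look like the sketch below. It uses a StringIO as a hypothetical stand-in for a real file so that it runs anywhere; File.readlines on an actual path returns the same kind of array. Note that each line keeps its trailing newline unless you chomp it.

```ruby
require 'stringio'

# Hypothetical stand-in for a file on disk; the contents are made up.
io = StringIO.new("foo\nbaz\nbar\n")
lines = io.readlines              # array of lines, each ending in "\n"

# Keep only the lines matching either alternative.
matches = lines.grep(/foo|bar/)

puts matches.map(&:chomp)
```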
This isn't the entire solution because there are dragons waiting in the woods. /foo|bar/ actually matches substrings, not complete words:
%w[food bartender].grep(/foo|bar/) # => ["food", "bartender"]
which is most likely not what you want.
To fix this we have to tell the regex engine to only find words:
%w[foo bar baz].grep(/\bfoo\b|\bbar\b/) # => ["foo", "bar"]
%w[food bartender].grep(/\bfoo\b|\bbar\b/) # => []
\b means a "word boundary", which is the transition between a word character and a non-word character (or the start or end of the string). \w is the pattern that defines a word character, and it's described in the Regexp documentation. I STRONGLY recommend reading about that as there are additional potential issues you can run into. For our purposes though, \b and the default behavior are probably fine.
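As one illustration of those potential issues, here is a small sketch showing how the definition of a word character affects \b. The sample strings are made up for the example. Underscores and digits count as word characters, while a hyphen does not:

```ruby
pattern = /\bfoo\b/

# "-" is not a word character, so there is a boundary right after "foo".
hyphen = "foo-bar".match?(pattern)

# "_" and digits are word characters, so no boundary follows "foo".
underscore = "foo_bar".match?(pattern)
digit      = "foo1".match?(pattern)

p [hyphen, underscore, digit] # prints [true, false, false]
```

Whether "foo-bar" should count as containing the word "foo" depends on your data, so it is worth checking a few real lines against the pattern.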
There's a lot of duplication in that little pattern though, and regular expressions let us trim out the replication:
%w[foo bar baz].grep(/\b(foo|bar)\b/) # => ["foo", "bar"]
%w[food bartender].grep(/\b(foo|bar)\b/) # => []
Using the parentheses groups foo|bar into a capture group, so the surrounding \b applies to everything inside the parentheses, reducing the noise.
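If the words to search for live in an array rather than being typed into the pattern by hand, the grouped pattern can be built from that array. This is only a sketch: the names wanted and candidates are hypothetical, and it relies on Regexp.union, which joins its arguments with | and escapes any special characters in plain strings.

```ruby
# Hypothetical list of words to search for.
wanted = %w[foo bar]

# Regexp.union(wanted) is equivalent to /foo|bar/; wrap it in a group
# so the word boundaries apply to every alternative.
pattern = /\b(#{Regexp.union(wanted)})\b/

candidates = %w[foo bar baz food bartender]
found = candidates.grep(pattern)

p found # prints ["foo", "bar"]
```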
Sometimes you don't want to actually capture the string, you just want to match it. If that's the case, read about non-capturing groups, written (?:...), in the documentation.
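A short sketch of the difference, reusing the words from the examples above: grep returns the same elements with either kind of group, and the distinction only shows up when you look at the match data.

```ruby
capturing     = /\b(foo|bar)\b/
non_capturing = /\b(?:foo|bar)\b/

words = %w[foo bar baz food]

# Both patterns select the same elements.
with_capture    = words.grep(capturing)
without_capture = words.grep(non_capturing)

# Only the capturing group records what it matched.
captured     = "foo".match(capturing).captures
not_captured = "foo".match(non_capturing).captures

p with_capture, without_capture # both print ["foo", "bar"]
p captured                      # prints ["foo"]
p not_captured                  # prints []
```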