A shorter, POSIX-compliant awk
solution, which is a generalized and optimized translation of @Tiago's excellent Perl-based answer.
One advantage of these answers over the sed
solutions is that they use literal substring matching rather than regular expressions, which allows passing in arbitrary search strings without needing to worry about escaping regex metacharacters (note, however, that awk -v still interprets backslash escape sequences in the value passed). That said, if you did want regex matching, use the ~
operator rather than the index()
function; e.g., index($0, name)
would become $0 ~ name
. You then have to make sure that the value passed for name
either contains no accidental regex metacharacters meant to be treated as literals or is an intentionally crafted regex.
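To make the difference between literal and regex matching concrete, here is a minimal sketch. The two input lines and the search string a.b are made up for illustration; they are not from the question.

```shell
# Hypothetical lines: only the first contains "a.b" verbatim.
line='ip pool a.b'
other='ip pool axb'

# Literal matching with index(): the dot is just a dot.
literal_hits=$(printf '%s\n' "$line" "$other" |
  awk -v name='a.b' 'index($0, name)')

# Regex matching with ~: the unescaped dot matches any character,
# so the second line matches as well.
regex_hits=$(printf '%s\n' "$line" "$other" |
  awk -v name='a.b' '$0 ~ name')

printf 'literal:\n%s\nregex:\n%s\n' "$literal_hits" "$regex_hits"
```

With index(), only the first line is reported; with ~, both are, which is the kind of accidental match the paragraph above warns about.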
name='DOG' # Case-sensitive name to search for.
awk -v name="$name" '/^[^[:space:]]/ {if (p) exit; if (index($0,name)) {p=1}} p' file
- Option
-v name="$name"
defines awk
variable name
based on the value of shell variable $name
(awk
has no direct access to shell variables).
- Variable
p
is used as a flag to indicate whether the current line should be printed, i.e., whether it is part of the section of interest; as long as p
is not initialized, it is treated as 0
(false) in a Boolean context.
- Pattern
/^[^[:space:]]/
matches only header lines (lines that start with a non-whitespace character), and the associated action ({...}
) is only processed for them:
if (p) exit
exits processing altogether, if p
is already set, because that implies that the next section has been reached. Exiting right away has the benefit of not having to process the remainder of the file.
if (index($0, name))
looks for the name of interest as a literal substring in the header line at hand and, if found (in which case index() returns the 1-based position at which the substring was found, which is interpreted as true in a Boolean context), sets flag p to 1 ({p=1}).
p
simply prints the current line, if p
is 1
, and does nothing otherwise. That is, once the section header of interest has been found, it and subsequent lines are printed (up until the next section or the end of the input file).
Note that this is an example of a pattern-only command: only a pattern (condition) is specified, without an associated action ({...}
), in which case the default action is to print the current line, if the pattern evaluates to true. (That technique is used in the common shorthand 1
to simply unconditionally print the current record.)
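As a sketch of how the pieces explained above work together, the following runs the command against a small made-up input. The section names and the indented lines are assumptions for illustration only; the real input file may look different.

```shell
# Hypothetical sample input: header lines start in column 1,
# the lines that belong to a section are indented.
input='ip pool CAT
  address 10.0.0.1
ip pool DOG
  address 10.0.0.2
  mask 255.255.255.0
ip pool BIRD
  address 10.0.0.3'

name='DOG' # Case-sensitive name to search for.
result=$(printf '%s\n' "$input" |
  awk -v name="$name" '/^[^[:space:]]/ {if (p) exit; if (index($0,name)) {p=1}} p')

printf '%s\n' "$result"
```

This should print the DOG header and its two indented lines: p becomes 1 on the DOG header, the pattern-only p prints from that line on, and the BIRD header triggers the exit before anything else is printed.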
If case-INsensitivity is needed:
name='dog' # Case-INsensitive name to search for.
awk -v name="$name" \
'/^[^[:space:]]/ {if(p) exit; if(index(tolower($0),tolower(name))) {p=1}} p' file
Caveat: The BSD-based awk
that comes with macOS (still applies as of 10.12.1) is not UTF-8-aware: the case-insensitive matching won't work with non-ASCII letters such as ü.
GNU awk
alternative, using the special IGNORECASE
variable:
awk -v name="$name" -v IGNORECASE=1 \
'/^[^[:space:]]/ {if(p) exit; if(index($0,name)) {p=1}} p' file
Another POSIX-compliant awk
solution:
name='dog' # Case-insensitive name of section to extract.
awk -v name="$name" '
index(tolower($0),tolower(name)) {inBlock=1; print; next} # 1st section line found.
inBlock && !/^[[:space:]]/ {exit} # Exit at start of next section.
inBlock # Print 2nd, 3rd, ... section line.
' file
Note:
next
skips the remaining pattern-action pairs and proceeds to the next line.
/^[[:space:]]/
matches lines that start with at least one whitespace char. As @Chrono Kitsune explains in his answer, if you wanted to match lines that start with exactly one whitespace char, use /^[[:space:]][^[:space:]]/
. Also note that, despite its name, character class [:space:]
matches ANY form of whitespace, not just spaces - see man isspace
.
- There's no need to initialize flag variable
inBlock
, as it defaults to 0
in numeric/Boolean contexts.
- If you have GNU
awk
, you can more easily achieve case-insensitive matching by setting the IGNORECASE
variable to a nonzero value (-v IGNORECASE=1
) and simply using index($0, name)
inside the program.
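Here is a quick sketch of this variant in action, again with made-up input (the section names and indented lines are hypothetical). The header is spelled Dog while the search name is dog, to exercise the tolower() calls.

```shell
# Hypothetical sample input with a mixed-case section name.
input='ip pool CAT
  address 10.0.0.1
ip pool Dog
  address 10.0.0.2
ip pool BIRD
  address 10.0.0.3'

name='dog' # Case-insensitive name of section to extract.
result=$(printf '%s\n' "$input" | awk -v name="$name" '
  index(tolower($0),tolower(name)) {inBlock=1; print; next} # 1st section line found.
  inBlock && !/^[[:space:]]/       {exit}                   # Exit at start of next section.
  inBlock                                                   # Print 2nd, 3rd, ... section line.
')

printf '%s\n' "$result"
```

This should print the Dog header followed by its one indented line. Note that in this variant the first rule is tested against every line, not just header lines, so an indented line that happens to contain the name would also start the output.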
A GNU awk
solution, IF you can assume that all section header lines start with 'ip'
(so as to break the input into sections that way, rather than looking for leading whitespace):
awk -v RS='(^|\n)ip' -F'\n' -v name="$name" -v IGNORECASE=1 '
index($1, name) { sub(/\n$/, ""); print "ip" $0; exit }
' file
-v RS='(^|\n)ip'
breaks the input into records, each comprising the lines between two line-starting instances of the string 'ip'.
-F'\n'
then breaks each record into fields ($1
, ...) by lines.
index($1, name)
looks for the name on the current record's first line - case-INsensitively, thanks to -v IGNORECASE=1
.
sub(/\n$/, "")
removes any trailing \n
, which can stem from the section of interest being the last in the input file.
print "ip" $0
prints the matching record, which comprises the entire section of interest; since the record doesn't include the separator, 'ip', it is prepended.
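The record-splitting approach can be tried out with the sketch below. The input is made up for illustration, and the command is guarded so that it only runs if an executable named gawk is on the PATH (an assumption about your system; on some systems GNU awk is installed as plain awk instead).

```shell
# Hypothetical sample input in which every header line starts with "ip".
input='ip pool CAT
  address 10.0.0.1
ip pool Dog
  address 10.0.0.2
ip pool BIRD
  address 10.0.0.3'

name='dog'
if command -v gawk >/dev/null 2>&1; then
  result=$(printf '%s\n' "$input" |
    gawk -v RS='(^|\n)ip' -F'\n' -v name="$name" -v IGNORECASE=1 '
      index($1, name) { sub(/\n$/, ""); print "ip" $0; exit }
    ')
  printf '%s\n' "$result"
else
  echo 'gawk not found; skipping the GNU awk example' >&2
fi
```

With GNU awk available, this should print the Dog header and its indented line. The first record is empty (the input starts with the separator), so it simply fails the index() test and is skipped.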