Defining a function which prints lines enclosed by two different regular expression patterns in bash using awk

Question

For example, to print the lines of drivers/pci/controller/dwc/pci-meson.c from the one matching dw_pcie_ops.*= through the one matching ^}, I can do

$ awk '/dw_pcie_ops.*=/{inblk=1} inblk==1&&/^}/{print $0; inblk=0} inblk==1{print $0}' drivers/pci/controller/dwc/pci-meson.c
static const struct dw_pcie_ops dw_pcie_ops = {
    .link_up = meson_pcie_link_up,
    .start_link = meson_pcie_start_link,
};

That is the expected result. I sometimes run this kind of command in a for loop over several files, printing a ### filename ### marker before each file and piping the result through another awk command to remove the markers that have no matching lines after them.
Because I use it so often, I thought I could define a bash function for it (printing the lines of a file enclosed by two regular expressions). So I tried

# usage : patinpat pat1 pat2 filename
function patinpat ( ) {
echo 'running patinpat'
echo '$1 = ' $1
echo '$2 = ' $2
echo '$3 = ' $3
awk '/"$1"/{inblk=1} inblk==1&&/"$2"/{print $0; inblk=0} inblk==1{print $0}' "$3"
}

But when I do

$ patinpat dw_pcie_ops.*= ^} drivers/pci/controller/dwc/pci-meson.c
running patinpat
$1 =  dw_pcie_ops.*=
$2 =  ^}
$3 =  drivers/pci/controller/dwc/pci-meson.c

I can see the function arguments were passed correctly, but awk also uses $0, $1, $2, ... for the current line and its fields, so it cannot tell the bash function arguments apart from its own fields. How can I do this?

ADD: For testing, assume the content of drivers/pci/controller/dwc/pci-meson.c is:

static int meson_pcie_host_init(struct pcie_port *pp)
{
    struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
    struct meson_pcie *mp = to_meson_pcie(pci);

    pp->bridge->ops = &meson_pci_ops;

    meson_set_max_payload(mp, MAX_PAYLOAD_SIZE);
    meson_set_max_rd_req_size(mp, MAX_READ_REQ_SIZE);

    return 0;
}


static const struct dw_pcie_host_ops meson_pcie_host_ops = {
    .host_init = meson_pcie_host_init,
};

static const struct dw_pcie_ops dw_pcie_ops = {
    .link_up = meson_pcie_link_up,
    .start_link = meson_pcie_start_link,
};

This is the same issue as with any `bash`-level command that needs to expand `bash` variable references. In `awk '/"$1"/{inblk=1} ....'` the outer single quotes (i.e., the delimiters of the `awk` script) keep `$1` from being expanded, so `awk` receives the text `"$1"` unchanged and uses it as the regexp. You could wrap the `awk` script in double quotes, but then you have to escape the nested double quotes, the `awk` field references that start with `$`, and so on. There are other ways to pass `bash` variables into an `awk` script: `-v var="${var}"` and `ENVIRON["var"]` (see the answers for examples). – markp-fuso Jun 10 '23 at 20:16

Ed Morton · Accepted Answer · 2023-06-10T13:18:04.987

4

Try this:

$ cat ggg
#!/usr/bin/env bash

patinpat() {
    local beg end

    printf 'running patinpat\n'
    printf '$1 = %s\n' "$1"
    printf '$2 = %s\n' "$2"
    printf '$3 = %s\n' "$3"

    beg="$1" end="$2" awk '
        BEGIN {
            beg = ENVIRON["beg"]
            end = ENVIRON["end"]
        }
        $0 ~ beg {
            inblk = 1
        }
        inblk {
            print
            if ( $0 ~ end ) {
                inblk = 0
            }
        }
    ' "$3"
}

patinpat "$@"

$ ./ggg 'dw_pcie_ops.*=' '^}' 'file'
running patinpat
$1 = dw_pcie_ops.*=
$2 = ^}
$3 = file
static const struct dw_pcie_ops dw_pcie_ops = {
    .link_up = meson_pcie_link_up,
    .start_link = meson_pcie_start_link,
};

See How do I use shell variables in an awk script? for the different ways to access the values of shell variables in an awk script.

Try to avoid the word "pattern" (or "pat") in your code, requirements, etc., though, as it's ambiguous; see How do I find the text that matches a pattern?. If you want your code to do something with regexps, then use the word regexp.
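For the multi-file use the question mentions (a ### filename ### marker per file), one possible extension of the function above is sketched here. The name blocks_in_files and the awk variable shown are made up for illustration, and the sketch assumes a marker should only appear for files that actually contain a block:

```shell
#!/usr/bin/env bash

# usage: blocks_in_files regexp1 regexp2 file...
# (hypothetical name; same ENVIRON technique as patinpat)
blocks_in_files() {
    local beg=$1 end=$2
    shift 2                                   # the remaining arguments are the files

    beg="$beg" end="$end" awk '
        BEGIN {
            beg = ENVIRON["beg"]
            end = ENVIRON["end"]
        }
        FNR == 1 { inblk = 0; shown = 0 }     # reset state at the start of each file
        $0 ~ beg { inblk = 1 }
        inblk {
            if ( !shown++ ) {
                print "### " FILENAME " ###"  # marker printed once, on the first hit
            }
            print
            if ( $0 ~ end ) {
                inblk = 0
            }
        }
    ' "$@"
}
```

Because the marker is printed only when a block is found, the second awk command that filtered out filename-only lines should no longer be needed.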

edited Jun 10 '23 at 13:18

answered Jun 10 '23 at 11:08

Ed Morton


OK, there was a missing `"` - I updated it. – Ed Morton Jun 10 '23 at 13:18
Yeah, that works! The magic seems to be in the `ENVIRON` thing. Thank you. – Chan Kim Jun 10 '23 at 14:59
If you use `awk`'s `-v` option, you don't need the `BEGIN` section. – Philippe Jun 10 '23 at 21:10
@Philippe that depends on what you want to do. Using `-v` will interpret escape sequences, so `\t` would become a literal tab, for example, while using `ENVIRON[]` will keep the regexps literal. That is why I chose it for this particular case, where a user would be typing regexps to pass into the script and so probably doesn't expect any translation. – Ed Morton Jun 11 '23 at 00:02
Exactly. But that can be handled by escaping the backslash: `\\t` – Philippe Jun 11 '23 at 09:06
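The difference discussed in these comments can be seen with a small experiment. This is only a sketch: via_v and via_environ are made-up helper names, and the awk variable s is arbitrary.

```shell
#!/usr/bin/env bash

# -v processes escape sequences in the value it assigns
via_v() {
    awk -v s="$1" 'BEGIN { print length(s) }'
}

# ENVIRON[] hands the value to awk exactly as it was typed
via_environ() {
    s="$1" awk 'BEGIN { print length(ENVIRON["s"]) }'
}

via_v 'a\tb'         # prints 3: \t has become a single tab character
via_environ 'a\tb'   # prints 4: backslash and t are still two characters
```

With `-v`, doubling the backslash (`a\\tb`) should give the same four-character string that `ENVIRON[]` delivers without any doubling.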

score 0 · Answer 2 · answered Jun 10 '23 at 11:47

0

To explain what your code is actually doing: in the command

awk '/"$1"/{inblk=1} inblk==1&&/"$2"/{print $0; inblk=0} inblk==1{print $0}' "$3"

you are using $ inside the following regular expressions

/"$1"/
/"$2"/

Inside a regular expression, $ denotes end of string, but here it is followed by 1 and 2 respectively (which are literal characters), so these regexps can never match: there are never any characters after the end of a line. The double quotes are matched literally as well.
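A small experiment along these lines shows the effect. It is only a sketch: count_literal and count_dynamic are made-up names, and the single input line foo stands in for a real file.

```shell
#!/usr/bin/env bash

# The regexp is written inside single quotes, so the shell never touches $1
# and awk is left with the never-matching regexp /"$1"/.
count_literal() {
    printf 'foo\n' | awk '/"$1"/ { n++ } END { print n + 0 }'
}

# The shell value is handed to awk as a variable and used as a dynamic regexp.
count_dynamic() {
    printf 'foo\n' | awk -v re="$1" '$0 ~ re { n++ } END { print n + 0 }'
}

count_literal foo   # prints 0: the argument is ignored, nothing matches
count_dynamic foo   # prints 1: the line foo matches the regexp foo
```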

answered Jun 10 '23 at 11:47

Daweo


So my question is: how can I pass the argument from bash ($1) as a pattern in the awk command? – Chan Kim Jun 10 '23 at 12:25
@ChanKim stop using the word "pattern" in this context - always use "string" or "regexp", whichever it is you mean, as how to use a bash variable as a string or a regexp are 2 different questions. – Ed Morton Jun 10 '23 at 13:19

Philippe · Answer 3 · 2023-06-12T13:14:26.700

0

You can use the range operator (a comma between two patterns). Save the following script as patinpat:

#!/usr/bin/env bash
awk -v begin="$1" -v end="$2" '($0 ~ begin), ($0 ~ end)' "$3"

Run with

patinpat 'dw_pcie_ops.*=' '^}' drivers/pci/controller/dwc/pci-meson.c
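To try the range form as a bash function rather than a separate script, something like the following should work. It is a sketch only: print_range and the temporary sample file are made up for illustration, and the sample is an excerpt of the test content from the question.

```shell
#!/usr/bin/env bash

# usage: print_range regexp1 regexp2 file   (hypothetical name)
print_range() {
    awk -v begin="$1" -v end="$2" '($0 ~ begin), ($0 ~ end)' "$3"
}

# Throwaway sample file holding two of the structs from the question
sample=$(mktemp)
cat > "$sample" <<'EOF'
static const struct dw_pcie_host_ops meson_pcie_host_ops = {
    .host_init = meson_pcie_host_init,
};

static const struct dw_pcie_ops dw_pcie_ops = {
    .link_up = meson_pcie_link_up,
    .start_link = meson_pcie_start_link,
};
EOF

# Prints only the second struct: the first one contains dw_pcie_host_ops,
# which dw_pcie_ops.*= does not match.
print_range 'dw_pcie_ops.*=' '^}' "$sample"
rm -f "$sample"
```

Since the regexps go through `-v` here, escape sequences in them are processed, so a backslash in a regexp would need doubling, as the comments on the accepted answer discuss.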

edited Jun 12 '23 at 13:14

answered Jun 10 '23 at 13:59

Philippe



@EdMorton That's even simpler, thanks for pointing out! – Philippe Jun 12 '23 at 13:15

score 0 · Answer 4 · answered Jun 12 '23 at 08:30

beg='1.'
end='9$'

jot 30 |

mawk 'NF <= !_ ? __ : $0 ~ end ? __*(__ = +_)^_ \
               : __ = !_'  end="$end" FS="${beg}|${end}"

Or, in this specific example, simplify it further to:

mawk '$0 ~ beg, $0 ~ end' beg="$beg" end="$end"
