
I'm following an example in an awk book. I have the following bash script named "25regex.sh":

#!/usr/bin/env bash                                                      

# wh                                                                     
# wy                                                                     
# why                                                                    
# whhy                                                                   
# whhhy                                                                  
# whhhhy 

fname=25regex.sh                                                         

awk '/wh+y/ { print }' $fname                                            

echo                                                                     
awk '/wh{3}y/ { print }' $fname                                          

When I run the script, it prints:

# why
# whhy
# whhhy
# whhhhy

awk '/wh{3}y/ { print }' $fname

I would expect the second awk command to print only:

# whhhy

So my question is, why is the second awk command being printed literally instead of being executed? Thanks.

The output of cat -A 25regex.sh is:

#!/usr/bin/env bash$
$
# wh$
# wy$
# why$
# whhy$
# whhhy$
# whhhhy$
$
fname=25regex.sh$
$
awk '/wh+y/ { print }' $fname$
$
echo$
# For some reason this is not working:$
awk '/wh{3}y/ { print }' $fname$
builder-7000
  • What do your end-of-line characters look like? I suspect you have something going on there. What's the output of `cat -A yourscript`, where `yourscript` is the name of the script file? If you have `^M` at the end of each line, those are carriage returns messing things up. – Benjamin W. Mar 02 '18 at 15:05
  • I can reproduce the problem using GNU `awk` with `echo "whhhy" | awk '/wh{3}y/'`, which produces no output. `echo "whhhy" | awk '/wh+y/'` produces the expected output. – chepner Mar 02 '18 at 15:29
  • From the manual: "However, because old programs may use `{` and `}` in regexp constants, by default gawk does not match interval expressions in regexps. __If either `--posix` or `--re-interval` are specified__ (see Command-Line Options), then interval expressions are allowed in regexps." – jas Mar 02 '18 at 15:30
  • Cross-site dup, at least: https://askubuntu.com/questions/745241/does-gnu-awk-accept-intervals-specified-using-braces-in-regular-expressions – chepner Mar 02 '18 at 15:34
  • Note that `gawk` 4 and later allow interval expressions by default, in accordance with the POSIX specification. – chepner Mar 02 '18 at 15:36
  • The same example is in the gawk manual, where it is explained: https://www.gnu.org/software/gawk/manual/gawk.html#index-EREs-_0028Extended-Regular-Expressions_0029 – z atef Mar 02 '18 at 15:47
  • @jas I forgot to mention I'm using mawk on Ubuntu. When I try the `--re-interval` (or `-r`) option, I get `awk: not an option: -r` – builder-7000 Mar 02 '18 at 15:55
  • I can't say definitively, but it's very likely that `{n}` in regular expressions is simply not supported by mawk. Install gawk if you can; if not, any expression you can write with `{n}` can be written without it, albeit more tediously. – jas Mar 02 '18 at 16:02
  • `man mawk`: _AWK uses extended regular expressions as with egrep(1). The regular expression metacharacters, i.e., those with special meaning in regular expressions are_: `^ $ . [ ] | ( ) * + ?` – James Brown Mar 03 '18 at 11:53
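
The workaround suggested in the comments (writing the repetition out instead of using `{3}`) can be sketched as follows. This is a minimal sketch, not a definitive fix: the sample words are inlined with `printf`, an assumption made here so that the awk command line is not itself part of the searched data, unlike in the original script, where every pattern also appears in the file being read. If the awk in use treats `{` and `}` as ordinary characters, as the comments suggest for mawk here, then `/wh{3}y/` would match the literal text `wh{3}y`, which would explain why the command line itself was printed.

```shell
#!/usr/bin/env bash

# Sample words, inlined (an assumption for this sketch) so that the awk
# command line itself is not part of the searched data.
sample=$(printf '%s\n' wh wy why whhy whhhy whhhhy)

# Write the three h's out instead of using the interval expression {3}.
result=$(printf '%s\n' "$sample" | awk '/whhhy/ { print }')
echo "$result"

# Probe: which of the two lines the interval pattern matches depends on the
# awk in use, so no expected output is given for this part.
probe=$(printf '%s\n' 'whhhy' 'wh{3}y' | awk '/wh{3}y/ { print }')
echo "interval pattern matched: $probe"
```

The first command should print only `whhhy` regardless of which awk is installed. The probe prints `whhhy` when interval expressions are supported and `wh{3}y` when the braces are taken literally.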
