I hit an issue when running mawk on Ubuntu 16.04:

echo "123-456" | mawk '$0~/^[0-9]{3}/ {print $0}'  

The command above outputs nothing, although the pattern should match.

Then I tried egrep with the same pattern:

echo "123-456" | egrep '^[0-9]{3}'  

It works fine!

Then I looked at the documentation, and the root cause seems to be that "Interval expressions were not traditionally available in awk." The "{3}" in the pattern causes the issue: if I use "[0-9][0-9][0-9]" instead of "[0-9]{3}", it works fine.
https://invisible-island.net/mawk/manpage/mawk.html https://www.math.utah.edu/docs/info/gawk_5.html
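To illustrate the workaround just described, here is a minimal sketch that spells out the repetition instead of using an interval expression. The helper name `repeat_atom` is hypothetical (it is not part of awk or mawk), and plain `awk` is used so the sketch runs with whichever implementation is installed; substitute `mawk` to try it there.

```shell
# Hypothetical helper (not part of awk): repeat a regex atom a fixed number
# of times, so that "[0-9]" and 3 become "[0-9][0-9][0-9]".
repeat_atom() {
    atom=$1
    count=$2
    out=
    while [ "$count" -gt 0 ]; do
        out=$out$atom
        count=$((count - 1))
    done
    printf '%s' "$out"
}

# Equivalent of '^[0-9]{3}' without the interval expression.
pattern="^$(repeat_atom '[0-9]' 3)"

# Only the first input line starts with three digits, so only it is printed.
printf '%s\n' "123-456" "12-3456" | awk -v re="$pattern" '$0 ~ re'
```

Passing the pattern with `-v` makes awk treat it as a dynamic regex, which keeps the expansion logic in the shell.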

I tried the options `--posix` and `--re-interval` with mawk, but neither of them works.
Is it possible to enable interval expressions in mawk? My OS is Ubuntu 16.04.4 and mawk is 1.3.3-17ubuntu2.

Thanks.

hek2mgl
yw5643
  • Sorry, but why not use `awk`? What is your concrete problem? You already know `mawk` regex does not support range (limiting) quantifiers, so why not use the right tool? – Wiktor Stribiżew Jun 27 '18 at 11:29
  • @WiktorStribiżew `mawk` claims that it supports extended posix regular expressions. extended (as basic) posix regular expressions *do* support range (*bound*) expressions – hek2mgl Jun 27 '18 at 11:46
  • @hek2mgl So, did you manage to make it work? – Wiktor Stribiżew Jun 27 '18 at 11:51
  • No, `mawk` is broken, it seems, or at least claims too much. But I totally understand the surprise of the OP that the - nowadays standard (awk == mawk) - awk interpreter does not work as it claims in the manual. I'd personally feel like I'm doing something wrong in that case, which is the perfect moment to ask a question on SO – hek2mgl Jun 27 '18 at 11:55

2 Answers

UPDATED: a much cleaner solution:

[g/n/m]awk '$-_~"^"(_="[0-9]")(_)_' 

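This is only slightly longer than the egrep syntax (`_` is initially empty, so `$-_` is `$0`, and the right-hand side builds the string `^[0-9][0-9][0-9]`):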

 awk '$-_~"^"(_="[0-9]")(_)_'

 egrep '^[0-9]{3}'

A rather ugly solution would be:

echo "123-456" | {mawk/mawk2} 'BEGIN { FS = "^$" } /^[0-9][0-9][0-9]/' 

Another, even clunkier, one would be:

echo "123-456" | {mawk/mawk2} 'BEGIN { FS = "^$" } match($0, "^[0-9]+") && (RLENGTH >= 3)'

This is far from ideal, of course. Stick with gawk for regex intervals if you have access to it.
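As a sketch of how the `match()`/`RLENGTH` approach above behaves, the following labels each line by the length of its leading digit run. The function name `classify_leading_digits` and the output labels are made up for illustration. `RLENGTH >= 3` selects the same lines as `^[0-9]{3}`, while `RLENGTH == 3` is a stricter variation (exactly three leading digits) that is not part of the original question.

```shell
# Hypothetical helper: report how the leading run of digits on each input
# line compares with three, without using interval expressions.
classify_leading_digits() {
    awk '{
        # After match(), RLENGTH holds the length of the leading digit run,
        # or -1 when the line does not start with a digit.
        match($0, /^[0-9]+/)
        if (RLENGTH >= 3) print "at-least-3: " $0
        if (RLENGTH == 3) print "exactly-3: " $0
    }'
}

printf '%s\n' "123-456" "1234-56" "12-3456" | classify_leading_digits
```

Here "123-456" gets both labels, "1234-56" only the at-least-3 label, and "12-3456" none.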

RARE Kpop Manifesto

With a newer version of mawk, this now works:

% mawk -W version
mawk 1.3.4 20230203
regex-funcs:        internal
% echo "123-456" | mawk '$0~/^[0-9]{3}/ {print $0}'
123-456
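Since the behaviour depends on the installed version, a script could probe for interval support at run time and fall back to the spelled-out pattern. This is only a sketch: the function name `awk_has_intervals` is hypothetical, and it assumes that an awk without interval support treats the braces as literal characters, as the 1.3.3 behaviour in the question suggests.

```shell
# Hypothetical probe: exit status 0 if this awk treats {3} as
# "repeat three times", non-zero if "aaa" fails to match the pattern.
awk_has_intervals() {
    awk 'BEGIN { exit !("aaa" ~ "^a{3}$") }'
}

if awk_has_intervals; then
    pattern='^[0-9]{3}'
else
    pattern='^[0-9][0-9][0-9]'
fi

# Prints the line with either pattern, since it starts with three digits.
echo "123-456" | awk -v re="$pattern" '$0 ~ re'
```

Replace `awk` with `mawk` in both places to test a specific implementation.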
Andre Wildberg