31

The following command is working as expected.

# some command | awk '/(\<^create\>|\<^alter\>|\<^drop\>)/,/;/' 
create table todel1 (id int) max_rows=2
/*!*/;
alter table todel1 engine=InnoDB
/*!*/;
create database common
/*!*/;
create database rules
/*!*/;

But it matches only the lowercase "create", "alter", etc. I want to use the IGNORECASE switch in the awk statement so that it returns every instance of the search terms, regardless of case.

shantanuo
  • 1
    The example in the accepted answer is mistakenly evaluating `IGNORECASE = 1` as a condition (with side effect), rather than as a statement in a block. This condition is truthy, and will result in every line being printed *at least* once. – mwfearnley Sep 30 '18 at 14:50

4 Answers

25

Add IGNORECASE = 1; to the beginning of your awk command like so:

bash-3.2$ echo "Create" | awk '/^create/;'
bash-3.2$ echo "Create" | awk 'IGNORECASE = 1;/^create/;'
Create
Andrew Yochum
  • 21
    Set it in the `BEGIN` block or on the command line since it doesn't need to be executed for each line of input. – Dennis Williamson Mar 08 '11 at 06:14
  • 5
    Note that this is a `gawk`ism. And to Dennis' second point, he means something along the lines of: `awk '/bunch of regex here/' IGNORECASE=1` – SiegeX Mar 08 '11 at 06:19
  • 2
    This does not work in awk version 20070501, at least. `echo "No match" | awk 'IGNORECASE = 1;/^create/;'` gives `No match`. It doesn't seem to add the implicit if-statement if you have anything in addition to the regex. – ceyko Apr 23 '14 at 22:27
  • 4
    As @ceykooo said, this is not working for me either. But this works for me: `echo "No match" | awk 'tolower($0) ~ /^create/'` – Daniel Pérez Rada Jun 07 '14 at 15:51
  • This worked for me: `awk -v RS= -v ORS='\n\n' '/search_item/' IGNORECASE=1 "${search_file}" > ./out.txt` (I was pulling out records separated by blank lines in a file, based on a search string). – Mike Q Sep 10 '14 at 18:51
  • 3
    Set it in the BEGIN block, or in any block for that matter, or it doesn't work properly. When it is not in a { } block, all lines of text are matched in gawk 3.1.6 and 4.1.1, and probably universally. For example, `echo -e "a\nb\nc" | awk 'IGNORECASE = 1; /B/'` outputs four lines containing a, b, b, c, whereas `echo -e "a\nb\nc" | awk 'BEGIN { IGNORECASE = 1 } /B/'` outputs only one line containing b. – kbulgrien Apr 09 '16 at 06:44
  • 2
    FYI, this did not work properly for me when using `mawk`. I installed `gawk` and all is now right with this world... – Digger Nov 13 '17 at 06:23
  • Thanks @DanielPérezRada. This is a far better solution since it is defined in POSIX and thus not gawk-specific. Also, of course, it targets the specific comparison. – TheMadsen Jan 19 '18 at 09:45
  • 2
    Just to say, I marked this answer down. The approximate reasons are given above, but specifically for me it was because it evaluates `IGNORECASE = 1` as a conditional expression when it should be in a block or set as a `gawk -v` variable. As a condition, it will cause an implicit `{print $0}` to occur for every line. – mwfearnley Oct 06 '18 at 13:13
  • echo -e "Create\na" | awk 'BEGIN{IGNORECASE = 1;}/^create/;' – qxo Oct 19 '18 at 07:49
  • @mwfearnley you are correct! – kvantour Sep 28 '21 at 13:42
20

The following line performs an OR test instead of an AND:

echo -e "Create\nAny text" | awk 'IGNORECASE = 1;/^create/;'
Create
Create
Any text

Using the BEGIN special pattern solved the problem:

echo -e "Create\nAny text" | awk 'BEGIN{IGNORECASE = 1}/^create/;'
Create

Hope this helps.

Sebastien.

SebMa
  • 4
    It's not so much an OR test, rather it just evaluates two expressions (one of which always evaluates to true), and so prints each input line either once or twice. – mwfearnley Sep 30 '18 at 14:42
11

For those whose awk does not support the IGNORECASE variable (it is specific to gawk):

Option 1

echo "CreAte" | awk '/^[Cc][Rr][Ee][Aa][Tt][Ee]/'

Option 2 (thanks @mwfearnley)

echo "CreAte" | awk 'tolower($0) ~ /^create/'
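
One detail of Option 2 that may be worth seeing in action: `tolower($0)` only lowercases the copy used for the comparison, so the printed line keeps its original case. A minimal sketch, with two made-up input lines used purely for illustration:

```shell
# tolower($0) affects only the value being compared; $0 itself is unchanged,
# so the matching line is printed exactly as it was read.
out=$(printf 'CreAte TABLE t1\nSELECT 1\n' | awk 'tolower($0) ~ /^create/')
printf '%s\n' "$out"
# prints: CreAte TABLE t1
```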
Juan Diego Godoy Robles
11

This is a bit late, but two answers to this question (including the accepted answer) mention doing awk 'IGNORECASE=1;...' - i.e. putting IGNORECASE=1 as a condition, instead of a statement in a block.

This should not be done. It does set the variable as intended, but it also, unintentionally, evaluates the assignment as a boolean expression, which is true.

A true condition without a block will cause the line to always be printed. If it happens to match the following pattern, it will also be printed a second time.
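
The effect does not depend on gawk and can be reproduced with any truthy assignment used as a pattern. A minimal sketch, where the variable name `x` is a hypothetical stand-in for `IGNORECASE` and the two input lines are made up:

```shell
# As a pattern, `x = 1` is always true, so every line is printed once;
# lines that also match the second pattern are printed a second time.
out=$(printf 'Create\nAny text\n' | awk 'x = 1; /^Create/')
printf '%s\n' "$out"
# prints:
# Create
# Create
# Any text
```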

What the accepted answer probably meant was awk '{IGNORECASE=1} ...', which sets the IGNORECASE variable on each line of text. This can be further improved by using the BEGIN condition to assign it only once. But a cleaner solution is to use the -v flag to set the parameter outside of the script logic:

awk -v IGNORECASE=1 '/(\<^create\>|\<^alter\>|\<^drop\>)/, /;/'
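
As a quick check that the `BEGIN` form and the `-v` form behave the same, the sketch below runs both against two made-up input lines. It assumes `gawk` is on the `PATH` and skips itself otherwise; the variable names `with_begin` and `with_v` are only illustrative.

```shell
# Runs only if gawk is installed; IGNORECASE has no special meaning in other awks.
if command -v gawk >/dev/null 2>&1; then
    # One-time assignment inside a BEGIN block.
    with_begin=$(printf 'Create\nAny text\n' | gawk 'BEGIN { IGNORECASE = 1 } /^create/')
    # The same assignment made before the program starts, via -v.
    with_v=$(printf 'Create\nAny text\n' | gawk -v IGNORECASE=1 '/^create/')
    printf 'BEGIN: %s\n-v:    %s\n' "$with_begin" "$with_v"
fi
```

In both cases only the matching line should be printed, and only once.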

Note that IGNORECASE is specific to gawk. For a method that is not specific to gawk, the GNU Awk User's Guide suggests using tolower in a pattern match:

awk '(tolower($0) ~ /(\<^create\>|\<^alter\>|\<^drop\>)/), /;/'
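
Because `\<` and `\>` are themselves GNU regexp operators, a version meant to run on other awks may need to drop them. The sketch below is one possible variant, not the original command: it assumes each keyword starts the line and is followed by a space, and the `sample` text is hypothetical data modelled on the output shown in the question.

```shell
# Hypothetical sample input, mixing upper and mixed case keywords.
sample='CREATE table todel1 (id int) max_rows=2
/*!*/;
SET TIMESTAMP=1299567000
/*!*/;
Alter table todel1 engine=InnoDB
/*!*/;'

# Range pattern: start at a line beginning with create/alter/drop (any case),
# end at the next line containing a semicolon.
result=$(printf '%s\n' "$sample" |
    awk '(tolower($0) ~ /^(create|alter|drop) /), /;/')
printf '%s\n' "$result"
```

With this input, the `SET TIMESTAMP` line and its terminator fall outside both ranges and are not printed.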
mwfearnley