In my case, if a certain pattern is found as the second field of a line in a file, I need to print the first two fields. It should also handle patterns that contain special characters such as a backslash.

My solution is to first use sed to replace each \ with \\, then pass the new value to awk as a variable; awk parses \\ back to \ and then matches it against field 2.

escaped_str=$( echo "$pattern" | sed 's/\\/\\\\/g')    
input | awk -v awk_escaped_str="$escaped_str"  '$2==awk_escaped_str { $0=$1 " "     $2 " "}; { print } '

But this seems too complicated, and it cannot handle every case.

Is there a simpler way that also covers all other special characters?

Qiu Yangfan

2 Answers

The way to pass a shell variable to awk without backslashes being interpreted is to pass it in the arg list instead of populating an awk variable with `-v` outside of the script:

$ shellvar='a\tb'

$ awk -v awkvar="$shellvar" 'BEGIN{ printf "<%s>\n",awkvar }'
<a      b>

$ awk 'BEGIN{ awkvar=ARGV[1]; ARGV[1]=""; printf "<%s>\n",awkvar }' "$shellvar"
<a\tb>

and then you can search a file for it as a string using index() or ==:

$ cat file
a       b
a\tb

$ awk 'BEGIN{ awkvar=ARGV[1]; ARGV[1]="" } index($0,awkvar)' "$shellvar" file
a\tb

$ awk 'BEGIN{ awkvar=ARGV[1]; ARGV[1]="" } $0 == awkvar' "$shellvar" file
a\tb

You need to set ARGV[1]="" after populating the awk variable so that the shell variable's value is not also treated as a file name. Unlike a `-v` assignment, all characters in a value passed this way are treated literally, with no "special" meaning.
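
Applied to the question's case (match field 2 literally, print the first two fields), the same idea could look like the sketch below. The sample input and the name `pat` are made up for illustration, and the input is read from stdin, which awk falls back to once the only operand has been cleared. Concatenating `""` onto `$2` is an extra precaution so that a numeric-looking pattern is still compared as a string.

```shell
# Hypothetical pattern containing a backslash
pattern='a\tb'

# Sample input (assumed): the second field of line 2 is the literal text a\tb
result=$(printf '%s\n' 'x1 foo rest' 'x2 a\tb rest of line' |
  awk 'BEGIN { pat = ARGV[1]; ARGV[1] = "" }  # copy the pattern, then clear the slot so it is not opened as a file
       ($2 "") == pat { print $1, $2 }        # string comparison on field 2, print the first two fields
      ' "$pattern")

printf '%s\n' "$result"   # prints: x2 a\tb
```

No sed pre-escaping step is needed here, because the value never passes through awk's escape-sequence processing.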

Ed Morton
  • Thanks, it works. When I tried it, this also works: `awk 'BEGIN{ printf "<%s>\n",ARGV[1] }' "$shellvar"`. So is there any advantage to copying ARGV[1] into awkvar and then using awkvar in the awk script, or are both OK? – Qiu Yangfan Aug 07 '14 at 15:52
  • Note what I said towards the bottom of my answer - you MUST set `ARGV[1]=""` before the end of the BEGIN section or awk will try to open whatever value it contains as a file, so unless you plan to do all of your processing in the BEGIN section you have no choice but to populate an awk variable from ARGV[1] before clearing it. – Ed Morton Aug 07 '14 at 16:04
  • Beautiful!! I just used this clever approach in [an answer](http://stackoverflow.com/a/34639482/1983854). – fedorqui Jan 06 '16 at 19:13

There are three variations you can try without needing to escape your pattern:

This one compares literal strings. Nothing in expr is interpreted as a regex:

 $2 == expr

This one tests whether a literal string is a substring of the field:

 index($2, expr)

This one tests a regex pattern:

 $2 ~ pattern
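
To see how the three tests differ, here is a small sketch with made-up input. The value `a.b` is passed with `-v` only because it contains no backslashes; the variable names and sample lines are assumptions for illustration.

```shell
# Hypothetical search value: "." is literal for == and index(), a wildcard for ~
expr='a.b'
input=$(printf '%s\n' 'k1 a.b' 'k2 axb' 'k3 xa.bx')

# Exact literal comparison: only field values identical to a.b
exact=$(printf '%s\n' "$input" | awk -v expr="$expr" '$2 == expr { print $1 }')

# Literal substring test: a.b anywhere inside field 2
contains=$(printf '%s\n' "$input" | awk -v expr="$expr" 'index($2, expr) { print $1 }')

# Regex test: "." matches any character, so axb matches as well
regex=$(printf '%s\n' "$input" | awk -v expr="$expr" '$2 ~ expr { print $1 }')

echo "exact:    "$exact      # k1
echo "contains: "$contains   # k1 k3
echo "regex:    "$regex      # k1 k2 k3
```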
konsolebox