
I have this awk statement:

glb_library="my_library"
awk "
        /^Direct Dependers of/ { next }
        /^---/                 { next }
        /^$glb_library:/       { ver=\$0; next }
                               { gsub(/[[:space:]]/, '', \$0); print ver':'\$0 }
      " file

Basically, I have enclosed the awk code in double quotes so that the shell variable glb_library is expanded, and I have escaped the $ character to prevent the shell from expanding $0. I followed the guidance from here.

awk gives me this error:

awk: syntax error at source line 5
 context is
                                   { gsub(/[[:space:]]/, >>>  ' <<<

I want to understand:

  • Is it legal to use single quotes inside awk? Why is '' not a null string like "" is?
  • Does awk treat single and double quotes differently?

My code worked after I escaped the single quotes with backslashes and used \"\" to represent the null string instead of ''.
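
One way to see where the error comes from is to print the program text instead of running it. This is only a sketch with a shortened version of the script above; `prog` is a name made up for the illustration. The shell expands `$glb_library`, turns `\$0` into `$0`, and passes the single quotes through untouched, so awk receives `''` and `':'`, which are not valid awk syntax.

```shell
glb_library="my_library"

# Build the same kind of double-quoted program text the shell would hand to awk.
prog="/^$glb_library:/ { ver=\$0; next }
{ gsub(/[[:space:]]/, '', \$0); print ver':'\$0 }"

# Show exactly what awk would be asked to parse.
printf '%s\n' "$prog"
```

The printed text still contains the single quotes, which is the spot the syntax error points at.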

learningbee
  • Do you know what the `-v` flag is for? – 123 Jun 08 '17 at 21:42
  • Yes, for passing variables to awk. – learningbee Jun 08 '17 at 21:43
  • So why are you using it? – 123 Jun 08 '17 at 21:43
  • Sorry, that was a typo. Edited the question. – learningbee Jun 08 '17 at 21:44
  • `awk` does not recognize single quotes. – William Pursell Jun 08 '17 at 21:45
  • You should use it and set `glb_library` as an awk variable to be used in the script, then you can put the whole thing in single quotes. – 123 Jun 08 '17 at 21:45
  • @WilliamPursell Do you have any documentation for that, like I know it doesn't but I've never seen it officially mentioned anywhere – 123 Jun 08 '17 at 22:21
  • This [document](https://www.gnu.org/software/gawk/manual/html_node/Quoting.html) talks about `''` being the null string in `awk`, but doesn't say `awk` doesn't recognize single quotes. – codeforester Jun 08 '17 at 22:26
  • **single quote is not special within double quotes** and treated like any other non-special character – karakfa Jun 08 '17 at 23:05
  • @WilliamPursell that's not true, awk recognizes single quotes just fine. – Ed Morton Jun 08 '17 at 23:13
  • @123 awk recognizes single quotes just fine, the confusion you're having is that **the shell** does not allow single quotes within a single-quote delimited script. That applies whether the script is awk, sed, perl or anything else. If you for some reason **need** to have an explicit single quote within an awk script then just store it in a file and execute it as `awk -f script` so the shell rule doesn't get in the way and awk will have no trouble with the single quotes but in general use the octal escape sequence `\047` anywhere you need a single quote and it'll work from shell or in a file. – Ed Morton Jun 08 '17 at 23:17
  • @EdMorton The POSIX specification for `awk` only specifies double quotes for string literals, and something as simple as `awk "END {print 'foo'}" < /dev/null` fails with both GNU `awk` 4.0.2 and BSD `awk` 20070501 (that ships with Mac OS X). What version of `awk` are you using that supports single-quoted strings? – chepner Jun 08 '17 at 23:48
  • @EdMorton That isn't the confusion I am having as I was already using a script to eliminate any possible shell influence. I know that you can use literal `'` in strings/regex, but in all awk versions I have ever used you cannot use single quotes to quote strings. If you run the script in you get `awk: script.awk:3: ^ invalid char ''' in expression`, and also the man page states `It is written in awk programs like this: "". In the shell, it can be written using single or double quotes: "" or ''` which would imply single quotes cannot be used. – 123 Jun 09 '17 at 11:23
  • @chepner I didn't say that `'` was a string delimiter in awk, it's not. The string delimiter in awk is `"`. What I said is that the statement `awk does not recognize single quotes` is not true - awk recognizes single quotes just fine. – Ed Morton Jun 09 '17 at 11:36
  • @123 right `'` is not the string delimiter in awk, `"` is. The comment William made and you replied to was `awk does not recognize single quotes` which isn't true and is what my comments are also in response to. It never occurred to me he might just mean `... as a string delimiter` and idk if that IS what he meant or if, as I thought, he (and subsequently you) were referring to the more general and frequently misunderstood problem of trying to use single quotes in a single-quote delimited script. – Ed Morton Jun 09 '17 at 11:58
  • I meant that awk does not recognize single quotes as a string delimiter, but I think it actually stands as a factually correct statement. Other than inside a double quoted string, a single quote character is, as far as I know, always a syntax error. If it is inside a double quoted string, I think it is correct to say that awk is not recognizing it, as it really doesn't "recognize" anything inside a string in any meaningful sense. Are there any examples where a single quote can be used outside of a string literal? – William Pursell Jun 10 '17 at 04:09

3 Answers


Based on the comments above by awk experts and some research, I am posting this answer:

  • awk strings are enclosed in double quotes, not single quotes; more precisely: single quotes are not string delimiters in awk, unlike shell
  • awk attaches no special meaning to single quotes and they need to be enclosed in double quotes if used in string literals
  • it is best to use single quotes to wrap awk statements on the command line, unlike the OP's code, which uses double quotes (Ed pointed this out clearly)

Further clarification:

  • "" is the null string in awk, not ''
  • to use single quotes in an awk string literal, enclose them in double quotes, as in "Ed's answers are great!"
  • other techniques followed while handling single quotes in awk are:

    a) use a variable, as in awk -v q="'" '{ print q }' ...

    b) use octal or hex notation, as in awk '{ print "\047"$0"\047" }' ...
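
As a small sketch of techniques a) and b), the two commands below wrap a line of input in single quotes. The variable name `q` and the sample text are only illustrative choices.

```shell
# a) pass the single quote in as an awk variable (q is an arbitrary name)
a=$(echo "some text" | awk -v q="'" '{ print q $0 q }')

# b) use the octal escape \047 inside a double-quoted awk string
b=$(echo "some text" | awk '{ print "\047" $0 "\047" }')

# Both lines should read: 'some text'
printf '%s\n%s\n' "$a" "$b"
```

In both cases the awk script itself stays inside shell single quotes, so the shell never touches it.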


Relevant documentation here.

codeforester
  • "awk treats single quotes as special characters" is misleading: it is precisely the fact that `'` has _no_ special meaning that precludes its use as a string delimiter. Demonstrating the use of `'` inside an _Awk_ double-quoted string is somewhat of a moot point, given that literal use of `'` is _impossible_ if the script is enclosed in `'...'` _as a whole_ - which is the only sensible choice. – mklement0 Jun 11 '17 at 22:35
  • I didn't use the term "special character" to mean "special meaning". Just reworded the answer. – codeforester Jun 11 '17 at 22:53

Never enclose any script in double quotes or you're sentencing yourself to backslash-hell. This is the syntax for what you're trying to do:

glb_library="my_library"
awk -v glb_library="$glb_library" '
        /^Direct Dependers of/ { next }
        /^---/                 { next }
        $0 ~ "^"glb_library":" { ver=$0; next }
                               { gsub(/[[:space:]]/, ""); print ver":"$0 }
      ' file
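
To try this out without the real input, here is a sketch that runs the same script against made-up data. The sample file contents are an assumption about what the input might look like (the question does not show it), and `sample` and `out` are names chosen for the demo.

```shell
glb_library="my_library"

# Hypothetical input; the real file format is not shown in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Direct Dependers of my_library
------------------------------
my_library:1.0
    app_one
    app_two
EOF

out=$(awk -v glb_library="$glb_library" '
        /^Direct Dependers of/ { next }
        /^---/                 { next }
        $0 ~ "^"glb_library":" { ver=$0; next }
                               { gsub(/[[:space:]]/, ""); print ver":"$0 }
      ' "$sample")
rm -f "$sample"

# Each depender line is printed with the version line prefixed.
printf '%s\n' "$out"
```

With this sample data, each indented depender comes out prefixed with `my_library:1.0:`, and no backslash escaping is needed anywhere in the script.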
Ed Morton

A pragmatic summary:

  • As Ed Morton's helpful answer sensibly recommends:
    Always use single quotes to enclose your awk script as a whole ('...'), which ensures that there's no confusion over what the shell interprets up front, and what awk ends up seeing.

  • To define strings inside an awk script, always use double quotes ("...").

    • " is the only string delimiter awk recognizes.
    • "..." strings are non-interpolating (you cannot embed variable references), but they do recognize escape sequences such as \n and \t.
  • A single quote (') has no syntactic meaning inside an awk script. However, if you're using '...' for your overall script, as recommended, you cannot use a literal ' inside of it anyway, because the shell's single-quoted strings do not permit embedded ' chars.

    • If you do need to use a literal single quote (') in your awk script, you have three choices:
      • Pass a variable that defines it, and use awk's string concatenation, based on directly adjoining string literals and variable references:
        awk -v q=\' 'BEGIN { print "I" q "m good." }' # -> I'm good.
      • Use an escape sequence inside "..."; for maximum portability and disambiguation, use an octal escape sequence (\047), not a hex one (\x27):
        awk 'BEGIN { print "I\047m good." }' # -> I'm good.
      • Use '\'' (sic) to "escape" embedded ' chars. (Technically, the first ' closes the single-quoted shell string, \' adds an escaped literal quote, and the last ' opens a new single-quoted string; the shell concatenates the pieces.) Thanks, snr:
        awk 'BEGIN { print "I'\''m good" }' # -> I'm good
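
Two of the points above can be checked with a short sketch. The names `name`, `out` and `prog` are only illustrative. The first command shows that awk does not expand `$name` inside "..." while `\t` is still turned into a tab, and that a variable's value is joined to a string by writing it next to the literal. The second prints the program text to show what awk actually receives when '\'' is used.

```shell
# $name stays literal inside an awk "..." string; \t becomes a tab.
# The awk variable name is concatenated by placing it next to the literal.
out=$(awk -v name="world" 'BEGIN { print "hello $name\tend"; print "hello " name }')
printf '%s\n' "$out"

# Print the program text instead of running it to see the effect of '\''.
prog='BEGIN { print "I'\''m good" }'
printf '%s\n' "$prog"
```

The second printf shows a plain ' inside the awk string, which is why awk has no trouble with it.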
mklement0