I need to run a jq query that contains double quotes. I wrapped the query in single quotes, so the double quote characters should be treated as ordinary characters. Unfortunately, jq seems to strip them. I don't understand how and why I should escape the double quote characters.

Example: I have the test.json file:

{
  "artifacts": [
    {
      "id": "foo",
      "name": "Foo",
      "version": "1.0",
      "licenses": [
        "GPL-1",
        "GPL-2"
      ]
    },
    {
      "id": "bar",
      "name": "Bar",
      "version": "3.0",
      "licenses": [
        "GPL-3",
        "Apache 2.0"
      ]
    },
    {
      "id": "ignored",
      "name": "Ignored",
      "version": "3.0",
      "licenses": [
        "Apache 2.0"
      ]
    }
  ]
}

I would like to list all artifacts (name and version) that have at least one GPL license. The result should be sorted alphabetically by name. The query that does this is:

[.artifacts[] | select(.licenses[] | startswith("GPL-"))] | unique_by(.id) | sort_by(.name) | .[] | "\(.name) \(.version)"

Unfortunately, when I execute the command it fails:

> cat .\test.json | jq -r '[.artifacts[] | select(.licenses[] | startswith("GPL-"))] | unique_by(.id) | sort_by(.name) | .[] | "\(.name) \(.version)"'
jq: error: syntax error, unexpected ')' (Windows cmd shell quoting issues?) at <top-level>, line 1:
[.artifacts[] | select(.licenses[] | startswith(GPL-))] | unique_by(.id) | sort_by(.name) | .[] | \(.name)
jq: error: syntax error, unexpected INVALID_CHARACTER (Windows cmd shell quoting issues?) at <top-level>, line 1:
[.artifacts[] | select(.licenses[] | startswith(GPL-))] | unique_by(.id) | sort_by(.name) | .[] | \(.name)
jq: 2 compile errors

The error message shows that the double quote characters are missing. I tried many combinations and finally found one that works:

> cat .\test.json | jq -r '[.artifacts[] | select(.licenses[] | startswith(""GPL-""""))] | unique_by(.id) | sort_by(.name) | .[] | """\(.name) \(.version)""'
Bar 3.0
Foo 1.0

I don't understand why I need two quotes first, then four, then three, and finally two.

The query works fine on Linux:

$ cat ./test.json | jq -r '[.artifacts[] | select(.licenses[] | startswith("GPL-"))] | unique_by(.id) | sort_by(.name) | .[] | "\(.name) \(.version)"'
Bar 3.0
Foo 1.0
mklement0
agabrys
  • The sad reality as of PowerShell 7.2.x is that an _extra, manual_ layer of ``\``-escaping of embedded `"` characters is required in arguments passed to _external programs_. This _may_ get fixed in a future version, which _may_ require opt-in. See [this answer](https://stackoverflow.com/a/66837948/45375) to the linked duplicate for details. – mklement0 Aug 01 '22 at 12:51
  • To make it explicit: `jq` isn't the culprit here - it is PowerShell and the way it passes arguments with embedded `"` chars. to _any_ external program, such as `jq`. – mklement0 Aug 01 '22 at 12:59
  • Or run the jq filter from a file to avoid these quoting issues: `cat file.json | jq -f jqfilter.txt` – js2010 Aug 01 '22 at 16:44
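The file-based approach from the last comment can be sketched as follows. This is a minimal example that assumes a POSIX shell for creating the file and reuses the `jqfilter.txt` name from the comment; on Windows the file could be created in any text editor instead, and the `jq -r -f jqfilter.txt test.json` invocation should be the same in PowerShell, since the filter's double quotes never pass through the command line.

```shell
# Store the filter in a file. The quoted heredoc delimiter ('EOF') keeps the
# shell from touching the double quotes or backslashes inside the filter.
cat > jqfilter.txt <<'EOF'
[.artifacts[] | select(.licenses[] | startswith("GPL-"))] | unique_by(.id) | sort_by(.name) | .[] | "\(.name) \(.version)"
EOF

# Run the filter against test.json from the question.
# The guard only skips the call when jq or test.json is not available.
if command -v jq >/dev/null 2>&1 && [ -f test.json ]; then
  jq -r -f jqfilter.txt test.json
fi
```

Because the filter is read from the file, it looks exactly like the query in the question, with no extra escaping.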

1 Answer

The jq authors recommend wrapping the query in double quotes when cmd is used on Windows, and then escaping the double quote characters inside it with backslashes (see Invoking jq):

When using the Windows command shell (cmd.exe) it's best to use double quotes around your jq program when given on the command-line (instead of the -f program-file option), but then double-quotes in the jq program need backslash escaping.

I tested single quotes with backslash-escaped double quotes in PowerShell on Windows. It works:

> cat .\test.json | jq -r '[.artifacts[] | select(.licenses[] | startswith(\"GPL-\"))] | unique_by(.id) | sort_by(.name) | .[] | \"\(.name) \(.version)\"'
Bar 3.0
Foo 1.0

Double quotes can be used too, but then each embedded double quote character must be escaped twice:

  • jq: " -> \"
  • PowerShell: " -> ""

Required conversion: " -> \""

> cat .\test.json | jq -r "[.artifacts[] | select(.licenses[] | startswith(\""GPL-\""))] | unique_by(.id) | sort_by(.name) | .[] | \""\(.name) \(.version)\"""
Bar 3.0
Foo 1.0
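The two conversions can also be derived mechanically from the plain filter. The following is only a sketch in POSIX shell with `sed`, not something from the jq or PowerShell documentation; the variable names are made up for illustration. It assumes the filter contains no backslash directly before a double quote and no `$` or backtick characters, which would need extra handling inside a PowerShell double-quoted string.

```shell
# The plain filter, exactly as it is written on Linux.
filter='[.artifacts[] | select(.licenses[] | startswith("GPL-"))] | unique_by(.id) | sort_by(.name) | .[] | "\(.name) \(.version)"'

# Step 1: backslash-escape every double quote (" -> \").
# This is the form that goes inside PowerShell single quotes.
single_quoted_form=$(printf '%s' "$filter" | sed 's/"/\\"/g')

# Step 2: double every double quote (\" -> \"").
# This is the form that goes inside PowerShell double quotes.
double_quoted_form=$(printf '%s' "$single_quoted_form" | sed 's/"/""/g')

# Print both arguments wrapped in their PowerShell quotes.
printf "'%s'\n" "$single_quoted_form"
printf '"%s"\n' "$double_quoted_form"
```

The two printed lines should match the filter arguments of the single-quoted and double-quoted PowerShell commands above.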

As mklement0 described in this post, the additional escaping is required due to a bug in PowerShell.

agabrys
  • Good summary, but it's worth pointing out that the only reason the extra ``\``-escaping is needed is due to a long-standing PowerShell bug. As an aside: `""` works perfectly fine inside `"..."`, but what you may see more frequently in practice is `` `" ``, using PowerShell's general escape character, `` ` `` (the backtick). – mklement0 Aug 01 '22 at 12:53
  • The linked duplicate also shows how to automate the extra ``\``-escaping, which is convenient in general, and a necessity if your JSON string isn't a string literal you control. – mklement0 Aug 01 '22 at 12:55