22

I need to work with some systems that use JMESPath to search JSON. How can I search for strings that match a pattern (like this)? How do I do this with a regular expression in case-insensitive mode?

P.S.: I am not sure why the AWS S3 CLI and Ansible use JMESPath instead of jq to query JSON. JMESPath seems to be missing these features, and the proposal to add a split function has been frozen since 2017 (like this and this). These features are all available in jq. What are the strengths of JMESPath that make it appealing?

Mig82
HKTonyLee

1 Answer

15

It's not so much about the difference between JMESPath and jq as the different ways they are used.

Suppose you are querying a remote resource whose result will number in the millions of records, but you only care about a specific, much smaller subset of those records. You have two choices:

  1. Have every record transmitted to you over the network, then pick out the ones you want locally
  2. Send your filter to the remote resource and have it do the filtering, so that it sends you only the matching records.

jq is typically used for the former, JMESPath for the latter. There's no reason why the remote service couldn't accept a jq filter, or why you couldn't use a JMESPath-based executable locally.
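The difference between the two choices can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `remote_records`, `fetch_all_then_filter`, `filter_on_server`, and `is_active` are hypothetical names, the "remote resource" is simulated by a local generator, and a plain Python predicate stands in for a jq or JMESPath filter.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Record = Dict[str, object]


def remote_records(n: int) -> Iterator[Record]:
    """Hypothetical stand-in for a large remote data set."""
    for i in range(n):
        yield {"id": i, "name": f"item-{i}", "active": i % 1000 == 0}


def fetch_all_then_filter(
    n: int, predicate: Callable[[Record], bool]
) -> Tuple[List[Record], int]:
    """Choice 1: every record is transferred, then filtered locally."""
    transferred = list(remote_records(n))  # all n records cross the "network"
    matches = [r for r in transferred if predicate(r)]
    return matches, len(transferred)


def filter_on_server(
    n: int, predicate: Callable[[Record], bool]
) -> Tuple[List[Record], int]:
    """Choice 2: the filter runs next to the data; only matches are transferred."""
    transferred = [r for r in remote_records(n) if predicate(r)]
    return transferred, len(transferred)


def is_active(record: Record) -> bool:
    """Example filter (an assumption for this sketch)."""
    return bool(record["active"])


local_matches, local_sent = fetch_all_then_filter(10_000, is_active)
remote_matches, remote_sent = filter_on_server(10_000, is_active)

print(f"filter locally:   {local_sent} records transferred")
print(f"filter on server: {remote_sent} records transferred")
```

Both calls return the same matches; only the number of records that would have to travel over the network differs.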

chepner
  • 1
    Thanks for your answer! This is getting interesting. I guess that because jq is a Turing-complete language, it is less suitable for running arbitrary filters on the server side. I am not familiar with JMESPath, but let me check whether it is Turing-complete. – HKTonyLee Jun 18 '21 at 20:35
  • 2
    I should stress that my answer is based solely on how I see each *used*, and not so much on what either would be *suitable* for. Perhaps JMESPath is optimized to be simpler to write, but less powerful. (Or maybe JMESPath was more powerful when introduced, and so made people choose it as their query language, but `jq` caught up in the meantime.) – chepner Jun 18 '21 at 20:37
  • 7
    Or JMESPath puts an emphasis on *filtering*, while `jq` emphasizes *transformation*. – chepner Jun 18 '21 at 20:38
  • I looked through some links on the Internet (e.g. https://github.com/serverlessworkflow/specification/issues/216, https://forum.snapcraft.io/t/jmespath-in-the-snap-tooling-we-need-your-help/4108/2); they all mention that JMESPath lacks features they needed, but no one said that jq lacks features they needed. The only complaint about jq is that it lacks a well-defined spec. I would assume JMESPath is less powerful than jq. – HKTonyLee Jun 18 '21 at 20:43
  • I agree that your "JMESPath puts an emphasis on filtering, while jq emphasizes transformation" is a good summary. – HKTonyLee Jun 18 '21 at 20:44
  • 9
    I suspect the reason has to do with the fact that it is trivially easy to write jq programs that will consume as much CPU and/or RAM as is available. Consider e.g. `range(0;infinite)` or `[range(0;infinite)]` – peak Jun 18 '21 at 22:30
  • Thanks @peak! That is a very good example. Attackers could easily DDoS servers that accept `jq` filters. – HKTonyLee Jun 22 '21 at 04:22
  • Fortunately, jqplay.org has an execution timeout. – peak Jun 22 '21 at 05:47
  • I am afraid this is not enough. It also needs ulimit to restrict memory usage, which means `jq` must be run in a separate process. – HKTonyLee Jun 22 '21 at 18:18
  • Sending a `jq` filter to the server is a no-go because jq is Turing-complete. Downvoted because you confidently say "There's no reason why ..." and won't fix it even though the first comment pointed out this issue. – user2297550 Jan 03 '22 at 16:15
  • 2
    I'm referring to *technical* reasons. If the server wants to accept the risk of processing a non-terminating `jq` filter, it's free to do so. – chepner Jan 03 '22 at 16:20