260

Is there a (roughly) SQL or XQuery-like language for querying JSON?

I'm thinking of very small datasets that map nicely to JSON where it would be nice to easily answer queries such as "what are all the values of X where Y > 3" or to do the usual SUM / COUNT type operations.

As completely made-up example, something like this:

[{"x": 2, "y": 0}}, {"x": 3, "y": 1}, {"x": 4, "y": 1}]

SUM(X) WHERE Y > 0     (would equate to 7)
LIST(X) WHERE Y > 0    (would equate to [3,4])

I'm thinking this would work both client-side and server-side with results being converted to the appropriate language-specific data structure (or perhaps kept as JSON)

A quick Googling suggests that people have thought about it and implemented a few things (JAQL), but it doesn't seem like a standard usage or set of libraries has emerged yet. While each function is fairly trivial to implement on its own, if someone has already done it right I don't want to re-invent the wheel.

Any suggestions?

Edit: This may indeed be a bad idea or JSON may be too generic a format for what I'm thinking.. The reason for wanting a query language instead of just doing the summing/etc functions directly as needed is that I hope to build the queries dynamically based on user-input. Kinda like the argument that "we don't need SQL, we can just write the functions we need". Eventually that either gets out of hand or you end up writing your own version of SQL as you push it further and further. (Okay, I know that is a bit of a silly argument, but you get the idea..)

serv-inc
  • 35,772
  • 9
  • 166
  • 188
allclaws
  • 5,575
  • 8
  • 30
  • 27
  • I have such a need too. I need to match incoming JSON requests by specific values at specific locations in the object tree. The query has actually to be configured by a (power) user. Current workaround is to build a make-shift XML out of JSON and apply XPath. – Vladimir Dyuzhev Apr 24 '11 at 16:52
  • 3
    It's more of a shell tool, but jq (https://stedolan.github.io/jq/) has been awesome for exploring json data. Try it out in the playground: https://jqplay.org/ – jtmoulia Feb 10 '15 at 02:48
  • There is a web-based tool that allows you to run SQL queries on public JSON feeds or APIs at [sqall.co](http://sqall.co). – Stack Man Aug 16 '15 at 21:37
  • **See also:** https://stackoverflow.com/questions/7681301/search-for-a-key-in-a-nested-python-dictionary https://stackoverflow.com/a/16508328/42223 – dreftymac Oct 30 '17 at 19:55
  • More ideas at https://stackoverflow.com/questions/1955505/parsing-json-with-unix-tools – wisbucky Dec 06 '17 at 23:55
  • Can somebody please compile a list of json query languages which is associate with platform or language? E.g. `JMespath` goes well `kubernetes`, `aws` uses `athena` while it may suit for `arm template`? – Rajesh Swarnkar Sep 07 '22 at 10:29

27 Answers27

112

EDIT Sept 2022:

JMESPath seems to be the most widely-used, fastest-growing, and best-reviewed of alternatives for this. It has many features, including "where"-style filters.

ORIGINAL:

Sure, how about:

They all seem to be a bit work in progress, but work to some degree. They are also similar to XPath and XQuery conceptually; even though XML and JSON have different conceptual models (hierarchic vs object/struct).

EDIT Sept 2015:

Actually there is now JSON Pointer standard that allows very simple and efficient traversal of JSON content. It is not only formally specified, but also supported by many JSON libraries. So I would call it actual real useful standard, although due to its limited expressiveness it may or may not be considered Query Language per se.

rattray
  • 5,174
  • 1
  • 33
  • 27
StaxMan
  • 113,358
  • 34
  • 211
  • 239
  • 89
    in other words, nothing standard and stable... :-( – Vladimir Dyuzhev Apr 24 '11 at 16:32
  • 2
    Talking about standard, I heard a rumor that XQuery 3.1 might be extended to support JSON queries (similar to [JSONiq](http://www.jsoniq.org/)). Of course, it could take some time since XQuery 3.0 is not officially released yet. – Julien Ribon Dec 14 '12 at 10:14
  • Oh mercy, I definitely hope not. All XML->JSON attempts I have seen have been horrible messes -- information models are incompatible. But I would like to see JQuery using same ideas, parts of syntax; just properly modified to JSON info model. – StaxMan Dec 14 '12 at 18:24
  • 1
    For anyone looking for a Ruby implementation of JSONPath: https://github.com/joshbuddy/jsonpath – Robert Ross Feb 24 '14 at 18:31
  • @GôTô: Using MongoDB, if you have that freedom, seems like a viable approach. (see the [answer below](http://stackoverflow.com/a/32607945/1587329) for an example of how to translate the query to the built-in shell) – serv-inc Sep 16 '15 at 12:02
51

Update: XQuery 3.1 can query either XML or JSON - or both together. And XPath 3.1 can too.

The list is growing:

Jonathan
  • 665
  • 5
  • 12
  • UNQL is dead domain and probably project. – Xdg Aug 11 '23 at 08:35
  • Possible to indicate popular companies using which query language? e.g. `JMESPath` used by Microsoft Azure's [azure-cli](https://learn.microsoft.com/en-us/cli/azure/query-azure-cli?tabs=concepts%2Cbash). `kubectl` goes good with `jq` and `jsonpath`. – Rajesh Swarnkar Sep 02 '23 at 12:25
50

I'd recommend my project I'm working on called jLinq. I'm looking for feedback so I'd be interested in hearing what you think.

If lets you write queries similar to how you would in LINQ...

var results = jLinq.from(records.users)

    //you can join records
    .join(records.locations, "location", "locationId", "id")

    //write queries on the data
    .startsWith("firstname", "j")
    .or("k") //automatically remembers field and command names

    //even query joined items
    .equals("location.state", "TX")

    //and even do custom selections
    .select(function(rec) {
        return {
            fullname : rec.firstname + " " + rec.lastname,
            city : rec.location.city,
            ageInTenYears : (rec.age + 10)
        };
    });

It's fully extensible too!

The documentation is still in progress, but you can still try it online.

dkretz
  • 37,399
  • 13
  • 80
  • 138
hugoware
  • 35,731
  • 24
  • 60
  • 70
  • @hugoware: is there any documentation for this. Are there any queries other than .starts() (such as contains?) – Rikki Sep 18 '12 at 15:52
  • 8
    Last update 8 years ago and no response to being asked if the project is dead 5 years ago... I think the project is dead. – cfc Oct 29 '18 at 14:53
  • 1
    Yes, seems to be dead. Link is now a 404. [JMESPath](http://jmespath.org/) seems to be quite robust, standardized, and cross-platform, although not sure if it can do joins like jLinq. – Simon East Jun 15 '21 at 02:29
20

JMESPath works really quite easy and well: http://jmespath.org/. It has a thorough specification, and libraries for multiple languages. It is being used by Amazon in the AWS command line interface, so it’s got to be quite stable.

Syntax Examples:

// Select a single item
people[1].firstName

// Select a slice of an array
people[0:5]

// Select all the first names
people[*].firstName

// Select all first names based on search term
people[?state=='VA'].firstName

// Count how many people are over 35
length(people[?age>`35`])

// Select only the name and age of people over 35
people[?age>`35`].{name: name, age: age}

// Join expressions together to sort and join elements into a string
people[?state == 'WA'].name | sort(@) | join(', ', @)

There are plenty more live examples you can play with in the docs.

Simon East
  • 55,742
  • 17
  • 139
  • 133
Frido Emans
  • 5,120
  • 2
  • 27
  • 42
  • 6
    Yet at the same time on the same page: "If you need more advanced features that may not be possible with --query, you can check out jq, a command line JSON processor." So it seems that AWS uses `jmespath` for the `--query` parameter, but recommends `jq` for command line piping. http://docs.aws.amazon.com/cli/latest/userguide/controlling-output.html – wisbucky Dec 07 '17 at 00:14
  • +1 for JMESPath - it's a very thorough and detailed specification, with libraries for multiple languages, and a relatively simple and intuitive syntax. – Simon East Jun 14 '21 at 08:44
19

jq is a JSON query language, mainly intended for the command-line but with bindings to a wide range of programming languages (Java, node.js, php, ...) and even available in the browser via jq-web. The C-based implementation of jq is generally known as "jq", and the Go-based version as "gojq".

Here are some illustrations based on the original question, which gave this JSON as an example:

 [{"x": 2, "y": 0}}, {"x": 3, "y": 1}, {"x": 4, "y": 1}]

SUM(X) WHERE Y > 0 (would equate to 7)

map(select(.y > 0)) | add

LIST(X) WHERE Y > 0 (would equate to [3,4])

map(.y > 0)

jq syntax extends JSON syntax

Every JSON expression is a valid jq expression, and expressions such as [1, (1+1)] and {"a": (1+1)}` illustrate how jq extends JSON syntax.

A more useful example is the jq expression:

{a,b}

which, given the JSON value {"a":1, "b":2, "c": 3}, evaluates to {"a":1, "b":2}.

peak
  • 105,803
  • 17
  • 152
  • 177
10

The built-in array.filter() method makes most of these so-called javascript query libraries obsolete

You can put as many conditions inside the delegate as you can imagine: simple comparison, startsWith, etc. I haven't tested but you could probably nest filters too for querying inner collections.

forivall
  • 9,504
  • 2
  • 33
  • 58
Roger
  • 1,746
  • 17
  • 13
  • 6
    `array.filter()` is part of JavaScript, not JSON. – Iain Samuel McLean Elder Oct 28 '14 at 23:21
  • 2
    JSON is a subset of JavaScript, but there’s lots of languages that support both JSON and arrays and that have an array filter method implemented, so this is a valid point. – dakab Jun 22 '15 at 07:51
  • While your answer is correct, i.e. `filter()` can be used to "query" JSON objects there is one downside: it accepts function and not string. Therefore, you can't use this over the network to pass as a parameter. – iaforek Sep 15 '21 at 13:35
8

Another way to look at this would be to use mongoDB You can store your JSON in mongo and then query it via the mongodb query syntax.

forivall
  • 9,504
  • 2
  • 33
  • 58
Alexx Roche
  • 3,151
  • 1
  • 33
  • 39
7

ObjectPath is simple and ligthweigth query language for JSON documents of complex or unknown structure. It's similar to XPath or JSONPath, but much more powerful thanks to embedded arithmetic calculations, comparison mechanisms and built-in functions.

Example

Python version is mature and used in production. JS is still in beta.

Probably in the near future we will provide a full-fledged Javascript version. We also want to develop it further, so that it could serve as a simpler alternative to Mongo queries.

Ela Bednarek
  • 171
  • 2
  • 3
  • 1
    Except that it has hardly any documentation so it's difficult how to find out how to do anything like find elements with text like something. – James O'Brien Oct 11 '14 at 01:13
  • 1
    @JamesO'Brien Thanks for your remark - if you find the [reference](http://adriank.github.io/ObjectPath/reference.html) useless and have any specific problem in mind, let us know [here](https://groups.google.com/forum/#!forum/objectpath) - somebody will try to help. We're currently working on making the docs more usable, I'd love your comments. – Ela Bednarek Oct 13 '14 at 15:09
  • Thanks - I appreciate it. I would like to use. Currently I'm using http://ashphy.com/JSONPathOnlineEvaluator/? – James O'Brien Oct 14 '14 at 15:53
  • Couldn't figure out how to use this with Javascript because of a total lack of documentation. – user3670743 Apr 21 '16 at 22:10
  • We're looking for contributors to help with that. You can write on Github or google groups https://groups.google.com/forum/#!members/objectpath what you're trying to achieve, and I'm sure somebody will answer your Qs. – Ela Bednarek Apr 21 '16 at 22:39
7

If you are using .NET then Json.NET supports LINQ queries over the top of JSON. This post has some examples. It supports filtering, mapping, grouping, etc.

James Newton-King
  • 48,174
  • 24
  • 109
  • 130
4

if you want to use pure javascript try this:

var object = { result: { data: { point1: "x", value: 2 }, foo: { bar: 7 } } },
    path = 'result.data.value',
    getValue = (o, p) => p.split('.').reduce((r, k) => r[k], o);

console.log(getValue(object, path));
4

OK, this post is a little old, but... if you want to do SQL-like query in native JSON (or JS objects) on JS objects, take a look at https://github.com/deitch/searchjs

It is both a jsql language written entirely in JSON, and a reference implementation. You can say, "I want to find all object in an array that have name==="John" && age===25 as:

{name:"John",age:25,_join:"AND"}

The reference implementation searchjs works in the browser as well as as a node npm package

npm install searchjs

It can also do things like complex joins and negation (NOT). It natively ignores case.

It doesn't yet do summation or count, but it is probably easier to do those outside.

deitch
  • 14,019
  • 14
  • 68
  • 96
3

Here's some simple javascript libraries that will also do the trick:

  • Dollar Q is a nice lightweight library. It has a familiar feel to the chaining syntax made popular by jQuery and is only 373 SLOC.
  • SpahQL is a fully featured query language with a syntax similar to XPath (Homepage, Github
  • jFunk is an in progress query language, with a syntax similar to CSS/jQuery selectors. It looked promising, but hasn't had any development beyond its in initial commit.

  • (added 2014): the jq command line tool has a neat syntax, but unfortunately it is a c library. Example usage:

    < package.json jq '.dependencies | to_entries | .[] | select(.value | startswith("git")) | .key'

forivall
  • 9,504
  • 2
  • 33
  • 58
3

In MongoDB, this is how it would work (in the mongo shell, there exist drivers for a language of your choice).

db.collection.insert({"x": 2, "y": 0}); // notice the ':' instead of ','
db.collection.insert({"x": 3, "y": 1});
db.collection.insert({"x": 4, "y": 1});

db.collection.aggregate([{$match: {"y": {$gt: 0}}}, 
                         {$group: {_id: "sum", sum: {$sum: "$x"}}}]);
db.collection.aggregate([{$match: {"y": {$gt: 0}}}, 
                         {$group: {_id: "list", list: {$push: "$x"}}}]);

The first three commands insert the data into your collection. (Just start the mongod server and connect with the mongo client.)

The next two process the data. $match filters, $group applies the sum and list, respectively.

serv-inc
  • 35,772
  • 9
  • 166
  • 188
3

Check out https://github.com/niclasko/Cypher.js (note: I'm the author)

It's a zero-dependency Javascript implementation of the Cypher graph database query language along with a graph database. It runs in the browser (tested with Firefox, Chrome, IE).

With relevance to the question. It can be used to query JSON endpoints:

load json from "http://url/endpoint" as l return l limit 10

Here's an example of querying a complex JSON document and performing analysis on it:

Cypher.js JSON query example

3

You could use linq.js.

This allows to use aggregations and selectings from a data set of objects, as other structures data.

var data = [{ x: 2, y: 0 }, { x: 3, y: 1 }, { x: 4, y: 1 }];

// SUM(X) WHERE Y > 0     -> 7
console.log(Enumerable.From(data).Where("$.y > 0").Sum("$.x"));

// LIST(X) WHERE Y > 0    -> [3, 4]
console.log(Enumerable.From(data).Where("$.y > 0").Select("$.x").ToArray());
<script src="https://cdnjs.cloudflare.com/ajax/libs/linq.js/2.2.0.2/linq.js"></script>
Nina Scholz
  • 376,160
  • 25
  • 347
  • 392
2

SpahQL is the most promising and well thought out of these, as far as I can tell. I highly recommend checking it out.

David
  • 606
  • 6
  • 11
2


I've just finished a releaseable version of a clientside JS-lib (defiant.js) that does what you're looking for. With defiant.js, you can query a JSON structure with the XPath expressions you're familiar with (no new syntax expressions as in JSONPath).

Example of how it works (see it in browser here http://defiantjs.com/defiant.js/demo/sum.avg.htm):

var data = [
       { "x": 2, "y": 0 },
       { "x": 3, "y": 1 },
       { "x": 4, "y": 1 },
       { "x": 2, "y": 1 }
    ],
    res = JSON.search( data, '//*[ y > 0 ]' );

console.log( res.sum('x') );
// 9
console.log( res.avg('x') );
// 3
console.log( res.min('x') );
// 2
console.log( res.max('x') );
// 4

As you can see, DefiantJS extends the global object JSON with a search function and the returned array is delivered with aggregate functions. DefiantJS contains a few other functionalities but those are out of the scope for this subject. Anywho, you can test the lib with a clientside XPath Evaluator. I think people not familiar with XPath will find this evaluator useful.
http://defiantjs.com/#xpath_evaluator

More information about defiant.js
http://defiantjs.com/
https://github.com/hbi99/defiant.js

I hope you find it useful... Regards

Hakan Bilgin
  • 1,113
  • 12
  • 11
2
  1. Google has a project called lovefield; just found out about it, and it looks interesting, though it is more involved than just dropping in underscore or lodash.

    https://github.com/google/lovefield

Lovefield is a relational query engine written in pure JavaScript. It also provides help with persisting data on the browser side, e.g. using IndexedDB to store data locally. It provides SQL-like syntax and works cross-browser (currently supporting Chrome 37+, Firefox 31+, IE 10+, and Safari 5.1+...


  1. Another interesting recent entry in this space called jinqJs.

    http://www.jinqjs.com/

    Briefly reviewing the examples, it looks promising, and the API document appears to be well written.


function isChild(row) {
  return (row.Age < 18 ? 'Yes' : 'No');
}

var people = [
  {Name: 'Jane', Age: 20, Location: 'Smithtown'},
  {Name: 'Ken', Age: 57, Location: 'Islip'},
  {Name: 'Tom', Age: 10, Location: 'Islip'}
];

var result = new jinqJs()
  .from(people)
  .orderBy('Age')
  .select([{field: 'Name'}, 
     {field: 'Age', text: 'Your Age'}, 
     {text: 'Is Child', value: isChild}]);

jinqJs is a small, simple, lightweight and extensible javaScript library that has no dependencies. jinqJs provides a simple way to perform SQL like queries on javaScript arrays, collections and web services that return a JSON response. jinqJs is similar to Microsoft's Lambda expression for .Net, and it provides similar capabilities to query collections using a SQL like syntax and predicate functionality. jinqJs’s purpose is to provide a SQL like experience to programmers familiar with LINQ queries.

mg1075
  • 17,985
  • 8
  • 59
  • 100
2

PythonQL offers an embedded syntax that IMHO is an improvement on SQL, principally because group, window, where, let, etc. can be freely intermixed.

$ cat x.py
#coding: pythonql
data = [{"x": 2, "y": 0}, {"x": 3, "y": 1}, {"x": 4, "y": 1}]
q = [x match {'x': as x, 'y': as y} in data where y > 0]
print(sum(q))
print(list(q))

q = [x match {'x': as x, 'y': as y} as d in data where d['y'] > 0]
print(sum(q))

This code shows two different answers to your question, depending on your need to handle the entire structure or just the value. Execution gives you the expected result.

$ python x.py
7
[3, 4]
7
Anthony Tomasic
  • 183
  • 1
  • 6
1

You can also use Underscore.js which is basically a swiss-knife library to manipulate collections. Using _.filter, _.pluck, _.reduce you can do SQL-like queries.

var data = [{"x": 2, "y": 0}, {"x": 3, "y": 1}, {"x": 4, "y": 1}];

var posData = _.filter(data, function(elt) { return elt.y > 0; });
// [{"x": 3, "y": 1}, {"x": 4, "y": 1}]

var values = _.pluck(posData, "x");
// [3, 4]

var sum = _.reduce(values, function(a, b) { return a+b; });
// 7

Underscore.js works both client-side and server-side and is a notable library.

You can also use Lo-Dash which is a fork of Underscore.js with better performances.

Michel Hua
  • 1,614
  • 2
  • 23
  • 44
1

Whenever possible I would shift all of the querying to the backend on the server (to the SQL DB or other native database type). Reason being is that it will be quicker and more optimized to do the querying.

I know that jSON can be stand alone and there may be +/- for having a querying language but I cannot see the advantage if you are retrieving data from the backend to a browser, as most of the JSON use cases. Query and filter at the backend to get as small a data that is needed.

If for whatever reason you need to query at the front-end (mostly in a browser) then I would suggest just using array.filter (why invent something else?).

That said what I think would be more useful is a transformation API for json...they are more useful since once you have the data you may want to display it in a number of ways. However, again, you can do much of this on the server (which can be much easier to scale) than on the client - IF you are using server<-->client model.

Just my 2 pence worth!

1

I'll second the notion of just using your own javascript, but for something a bit more sophisticated you might look at dojo data. Haven't used it but it looks like it gives you roughly the kind of query interface you're looking for.

1

The current Jaql implementation targets large data processing using a Hadoop cluster, so it might be more than you need. However, it runs easily without a Hadoop cluster (but still requires the Hadoop code and its dependencies to get compiled, which are mostly included). A small implementation of Jaql that could be embedded in Javascript and the a browser would be a great addition to the project.

Your examples above are easily written in jaql:

$data = [{"x": 2, "y": 0}, {"x": 3, "y": 1}, {"x": 4, "y": 1}];

$data -> filter $.y > 0 -> transform $.x -> sum(); // 7

$data -> filter $.y > 0 -> transform $.x; // [3,4]

Of course, there's much more too. For example:

// Compute multiple aggregates and change nesting structure:
$data -> group by $y = $.y into { $y, s:sum($[*].x), n:count($), xs:$[*].x}; 
    // [{ "y": 0, "s": 2, "n": 1, "xs": [2]   },
    //  { "y": 1, "s": 7, "n": 2, "xs": [3,4] }]

// Join multiple data sets:
$more = [{ "y": 0, "z": 5 }, { "y": 1, "z": 6 }];
join $data, $more where $data.y == $more.y into {$data, $more};
    // [{ "data": { "x": 2, "y": 0 }, "more": { "y": 0, "z": 5 }},
    //  { "data": { "x": 3, "y": 1 }, "more": { "y": 1, "z": 6 }},
    //  { "data": { "x": 4, "y": 1 }, "more": { "y": 1, "z": 6 }}]

Jaql can be downloaded/discussed at http://code.google.com/p/jaql/

0

If you are on python, There is Mini open source version of MongoDB, MontyDB https://github.com/davidlatwe/montydb which has over 500+ stars on github, and it is supported by JetBrains.

from montydb.utils import MontyList
response = [
    {'namespace': 'dash_mantine_components',
     'props': {'checked': True,
               'id': {'index': 0, 'type': 'checkbox'},
               'label': 'My first to do'},
     'type': 'Checkbox'},
    {'namespace': 'dash_mantine_components',
     'props': {'checked': True,
               'id': {'index': 1, 'type': 'checkbox'},
               'label': 'My Another to do'},
     'type': 'Input'}, 
    {'namespace': 'dash_mantine_components',
     'props': {'checked': False,
               'id': {'index': 2, 'type': 'checkbox'},
               'label': 'My next level to do'},
     'type': 'Div'}, 
]

cli = MontyList(response)
cli.find({'props.checked': True},{'type':1})

Output:

MontyList([{'type': 'Checkbox'}, {'type': 'Input'}])

I like this over other jsonquery options, as this library gives the same feel of MongoDB.

0

I am the developer of a new query language for JSON, called ~Q (pronounced "unquery"). It was designed to address many of the shortcomings of other JSON query languages, and balance ease of use and expressive power. As the main developer, I am biased, of course, but you can check it out and decide for yourself:

https://github.com/xcite-db/Unquery

As for the OP's examples, the queries in ~Q would be:

{
    "example1:[]" : "$sum(x)?y>0",
    "example2:[]" : ["x?y>0"]
}
Sela M.
  • 21
  • 2
0

I use SQLite for that: https://sqlite.org/json1.html

It's nice because you can use the actual SQL language and SQLite is very fast.

First I create a temp table:

create temp table data as select value from json_each(readfile('data.json'))

Then use the SQLite JSON functions:

select value->'$.foo' foo, count(value->'$.bar') nbar from data group by foo 
0

JSONata (https://jsonata.org/) is another option. Your examples would translate to:

// input:
[
  {"x": 2, "y": 0}, 
  {"x": 3, "y": 1}, 
  {"x": 4, "y": 1}
]

// LIST(X) WHERE Y > 0     yields [3, 4]
$[y>0].x

// SUM(X) WHERE Y > 0      yields 7
$sum($[y>0].x)
aeberhart
  • 744
  • 1
  • 4
  • 15