2

I have a basic REST service backed by a PostgreSQL database with a table with various columns, one of which is a JSONB column that contains arbitrary data. Clients can store data filling in the fixed columns and provide any JSON as opaque data that is stored in the JSONB column.

I want to allow the client to query the database with constraints on both the fixed columns and the JSONB. It is easy to translate some query parameters like ?field=value and convert that into a parameterized SQL query for the fixed columns, but I want to add an arbitrary JSONB query to the SQL as well.

This JSONB query string could contain SQL injection, how can I prevent this? I think that because the structure of the JSONB data is arbitrary I can't use a parameterized query for this purpose. All the documentation I can find suggests I use parameterized queries, and I can't find any useful information on how to actually sanitize the query string itself, which seems like my only option.

For example a similar question is: How to prevent SQL Injection in PostgreSQL JSON/JSONB field?

But I can't apply the same solution as I don't know the structure of the JSONB or the query, I can't assume the client wants to query a particular path using a particular operator, the entire JSONB query needs to be freely provided by the client.

I'm using golang, in case there are any existing libraries or code fragments that I can use.

edit: some example queries on the JSONB that the client might do:

(content->>'company') is NULL
(content->>'income')::numeric>80000
content->'company'->>'name'='EA' AND (content->>'income')::numeric>80000
content->'assets'@>'[{"kind":"car"}]'
(content->>'DOB')::TIMESTAMP<'2000-01-30T10:12:18.120Z'::TIMESTAMP
EXISTS (SELECT FROM jsonb_array_elements(content->'assets') asset WHERE (asset->>'value')::numeric > 100000)

Note that these don't cover all possible types of queries. Ideally I want any query that PostgreSQL supports on the JSONB data to be allowed. I just want to check the query to ensure it doesn't contain sql injection. For example, a simplistic and probably inadequate solution would be to not allow any ";" in the query string.

Shahin
  • 53
  • 6
  • What do you mean by "*JSONB query string*"? – Bergi Aug 06 '20 at 01:13
  • In general, you just pass the JSON value as a JSON text *string* to the database and then cast it to the expected type like any other literal value, and you can use parameterised queries for that just fine: `$1::jsonb`. – Bergi Aug 06 '20 at 01:15
  • @Bergi I mean PostgreSQL JSONB queries like this: https://kb.objectrocket.com/postgresql/how-to-query-a-postgres-jsonb-column-1433 For example if you had a "name" column of type string and a "details" column of type "JSONB" you could do a combined query like ```SELECT * FROM table WHERE name = "John" AND details->'physical'->'eye'->>'color' = 'blue';``` – Shahin Aug 06 '20 at 02:42
  • 1
    Can you please **[edit]** your question and provide us an example of that "json query string". Maybe you are simply looking for `where name @> $1` where `$1` is that "json query string". –  Aug 06 '20 at 05:54
  • @Shahin So? Even `WHERE details #>> $1 = $2` works as a parameterised query. – Bergi Aug 06 '20 at 09:40
  • Also have a look at [jsonpath expressions](https://www.postgresql.org/docs/current/functions-json.html#FUNCTIONS-SQLJSON-PATH) – Bergi Aug 06 '20 at 09:45
  • @Bergi Right, that is similar, or the same, as the JSONB queries I am talking about. But how does that help prevent SQL injection? These still need to be concatenated with the rest of the SELECT statement, right? – Shahin Aug 06 '20 at 21:36
  • @Shahin No, you don't concatenate SQL strings to form statements. Jsonpaths are values, they can be passed with parameterised queries: `WHERE jsonb_path_match(details, $1, $2)`. Alternatively you can escape them like any other literal. – Bergi Aug 06 '20 at 21:40
  • @Bergi can you show me how the client would construct/perform a range query like ```EXISTS (SELECT FROM jsonb_array_elements(content->'assets') asset WHERE (asset->>'value')::numeric > 100000)``` using Jsonpaths? – Shahin Aug 06 '20 at 22:30
  • @Shahin If I read the docs right, `content @@ '$.assets[*].value > 10000'` should do it. Or the same with `jsonb_path_match(content, …)`. – Bergi Aug 06 '20 at 22:48
  • @Bergi sorry, I didn't get around to test this until today. It seems like using jsonb_path_match will result in postgres extracting all jsonb documents to test them and won't take advantage of GIN indexes like jsonb queries do: https://www.postgresql.org/docs/9.4/datatype-json.html#:~:text=jsonb%20Indexing,performance%20and%20flexibility%20trade%2Doffs. – Shahin Aug 17 '20 at 23:20
  • @Bergi Also I can't get the query syntax to work. For JSONB I can do something simple like this `SELECT * FROM "mytable" WHERE (content->'company'->>'name'='EA');` I expect the equivalent to be `SELECT * FROM "mytable" WHERE jsonb_path_match(content, '$.company.name ? (@ == $v1)', '{"v1": "EA"}');` but this doesn't work, I get "No function matches the given name and argument types", I assume the args are invalid? – Shahin Aug 17 '20 at 23:22
  • @Shahin since json path expressions are only available since v12, you will need to refer to [the current docs](https://www.postgresql.org/docs/12/datatype-json.html#JSON-INDEXING) – Bergi Aug 18 '20 at 05:10
  • @Bergi Yes, I was using postgresql 11. I upgraded to 12 and jsonb_path_match seems to work as expected. I can do all the queries needed, for example select all people with a car that has a value under 25000: `SELECT * FROM "mytable" WHERE jsonb_path_match(content, 'exists($.assets[*] ? (@.kind == "car" && @.value < 25000))');` I'm not sure if/when the GIN index is being used but that could be a separate issue. If you add your comment as an answer I can accept it as the resolution, as it allows parameterized jsonb queries. Thanks! – Shahin Aug 19 '20 at 23:15

1 Answers1

1

You could allow the users to specify a path within the JSON document, and then parameterize that path within a call to a function like json_extract_path_text. That is, the WHERE clause would look like:

WHERE json_extract_path_text(data, $1) = $2

The path argument is just a string, easily parameterized, which describes the keys to traverse down to the given value, e.g. 'foo.bars[0].name'. The right-hand side of the clause would be parameterized along the same rules as you're using for fixed column filtering.

Robert Nubel
  • 7,104
  • 1
  • 18
  • 30
  • OK, but isn't this limited to simple partial queries of the form: ```some field = some value```? How would you do more complicated queries like: ```(some field) is NULL```; ```(some field)::numeric>80000```; ```(some field)@>'some arbitrary JSON'```; Or really complicated ones like: ```EXISTS (SELECT FROM jsonb_array_elements(field X that is an array) X WHERE (X->>'some field')::numeric > 100000)```; – Shahin Aug 06 '20 at 02:54
  • @Shahin Do you also allow these more complicated queries for fixed (non-json) columns from your REST api? How do you represent them with `?field=value` parameters? – Bergi Aug 06 '20 at 09:43
  • @Bergi no I don't need to allow more complicated queries on the fixed (non-json) columns because in my case they are all in a flat table. I dont have one-to-many relationships. – Shahin Aug 06 '20 at 21:19
  • @Shahin I guess then jsonpath is the tool for you. – Bergi Aug 06 '20 at 21:21