9

This workaround not works

CREATE FUNCTION json_array_castext(json) RETURNS text[] AS $f$
  SELECT array_agg(x::text) FROM json_array_elements($1) t(x);
$f$ LANGUAGE sql IMMUTABLE;

-- Problem:
SELECT 'hello'='hello';  -- true...
SELECT  (json_array_castext('["hello","world"]'))[1] = 'hello'; -- false!

So, how to obtain real array of text?

PS: with the supposed "first class citizen" JSONb, the same problem.


Edit: after @OtoShavadze good answer (the comment solved!), a manifest for PostgreSQL developers: Why x::text is not a cast? (using pg 9.5.6) and why it not generates an warning or an error?

Peter Krauss
  • 13,174
  • 24
  • 167
  • 304

7 Answers7

8

try json_array_elements_text instead of json_array_elements, and you don't need explicit casting to text (x::text), so you can use:

CREATE or replace FUNCTION json_array_castext(json) RETURNS text[] AS $f$
    SELECT array_agg(x) FROM json_array_elements_text($1) t(x);
$f$ LANGUAGE sql IMMUTABLE;

For your additional question

Why x::text is not a cast?

This is cast and because of this, its not giving any error, but when casting json string to text like this: ::text, postgres adds quotes to value.

Just for testing purposes, lets change your function to original again (as it is in your question) and try:

SELECT  
(json_array_castext('["hello","world"]'))[1] = 'hello',
(json_array_castext('["hello","world"]'))[1],
'hello'

As you see, (json_array_castext('["hello","world"]'))[1] gives "hello" instead of hello. and this was why you got false when comparing those values.

Oto Shavadze
  • 40,603
  • 55
  • 152
  • 236
  • Thanks Oto! It is not a critique, please ignore, is only a complement of my manifest :-) *In SQL, casting `char(N)` produces expected `text` with no quotations; in an embedded language or driver, casting SQL-`text` produces expected `string` with no quotations, casting `string` produces expected SQL-`text` with no quotations... It is the universal expected behaviour...* – Peter Krauss Jul 22 '17 at 00:33
  • For readers: this is the best answer (didactic and performance), and was the *accepted* one until 2020. The [new *accepted*](https://stackoverflow.com/a/64628819/287948) is the same solution with a little augmentation for nulls, so it is the best for your "library solution". – Peter Krauss Nov 01 '20 at 11:57
2

For this ugly behaviour of PostgreSQL, there are an ugly cast workaround, the operator #>>'{}':

CREATE or replace FUNCTION json_array_castext(json) RETURNS text[] AS $f$
    SELECT array_agg(x#>>'{}') FROM json_array_elements($1) t(x);
$f$ LANGUAGE sql IMMUTABLE;

SELECT  (json_array_castext('["hello","world"]'))[1] = 'hello'; -- true!

(edit) Year 2020, pg v12 performance check

We expect that specialized function json_array_elements_text() is better tham user-defined casting... But, how much better? 2 times? 20 times... or only a few percent?
And sometmes we can't use it, so, there are some loss of performance?

Preparing the test:

CREATE TABLE j_array_test AS -- JSON
  SELECT   array_to_json(array[x,10000,2222222,33333333,99999,y]) AS j
  FROM generate_series(1, 1900) t1(x), generate_series(1, 1900) t2(y);

CREATE TABLE jb_array_test AS --JSONb
  SELECT   to_jsonb(array[x,10000,2222222,33333333,99999,y]) AS j
  FROM generate_series(1, 1900) t1(x), generate_series(1, 1900) t2(y);

CREATE FUNCTION ...

Function names:

  • j_op_cast(json) use array_agg(x#>>'{}') FROM json_array_elements($1)
  • jb_op_cast(jsonb) use array_agg(x#>>'{}') FROM jsonb_array_elements($1)
  • j_func_cast(json) use array_agg(x) FROM json_array_elements_text($1)
  • jb_func_cast(jsonb) use array_agg(x) FROM jsonb_array_elements_text($1)

RESULTS: All results are near the same, the reported differences are perceptible only after some billions (~3610000) of function calls. For few thousands of calls they are equal-perfornance (!).

EXPLAIN ANALYZE select j_op_cast(j) from j_array_test; -- ~35000
EXPLAIN ANALYZE select j_func_cast(j) from j_array_test;  -- ~28000
-- Conclusion: about average time json_array_elements_text is ~22%  faster.
-- calculated as 200*(35000.-28000)/(28000+35000)

EXPLAIN ANALYZE select jb_op_cast(j) from jb_array_test; -- ~45000
EXPLAIN ANALYZE select jb_func_cast(j) from jb_array_test;  -- ~37000
-- Conclusion: about average time json_array_elements_text is ~20%  faster.
-- calculated as 200*(45000.-37000)/(45000+37000)

For both, JSON and JSONb, the performance difference is in the order of 20%, so in general (e.g. report or microservice output) it is negligible.

As expected JSON cast to text is faster than JSONB cast, because JSON is internally text and JSONB not.


PS: using PostgreSQL 12.4 on Ubuntu 20 LTS, virtual machine.

Peter Krauss
  • 13,174
  • 24
  • 167
  • 304
2
CREATE or replace FUNCTION json_to_array(json) RETURNS text[] AS $f$
  SELECT coalesce(array_agg(x), 
    CASE WHEN $1 is null THEN null ELSE ARRAY[]::text[] END)
  FROM json_array_elements_text($1) t(x);
$f$ LANGUAGE sql IMMUTABLE;

Test cases:

  • select json_to_array('["abc"]') => one element array
  • select json_to_array('[]') => an empty array
  • select json_to_array(null) => null
Tom Yeh
  • 1,987
  • 2
  • 15
  • 23
1

Oto's answer was a lifesaver, but it did have one boundary case that had me racking my brain. Due to the lossy nature of the cast, it works perfectly except in the case where you've got an empty json array. In that case you would expect an empty array to be returned, but it actually returns nothing. As a workaround, if you just concatenate the return value with an empty array it will have no affect in cases where there is actually a return, but do the right thing when you've got an empty array. Here's the updated SQL functions (for both json and jsonb) that implement the workaround.

CREATE or replace FUNCTION json_array_casttext(json) RETURNS text[] AS $f$
    SELECT array_agg(x) || ARRAY[]::text[] FROM json_array_elements_text($1) t(x);
$f$ LANGUAGE sql IMMUTABLE;

CREATE or replace FUNCTION jsonb_array_casttext(jsonb) RETURNS text[] AS $f$
    SELECT array_agg(x) || ARRAY[]::text[] FROM jsonb_array_elements_text($1) t(x);
$f$ LANGUAGE sql IMMUTABLE;

There are a few peculiarities like this one that point to the rough edges at integrating a document database into a mature relational one, but Postgres does an admirable job at handling most of them.

Joel B
  • 12,082
  • 10
  • 61
  • 69
  • Hi @JoelB, good enhance of Oto's solution, for empty case. Your solution is elegant, perhaps the best... But perhaps (need to test perfomance!) a faster solution is `SELECT CASE WHEN $1='[]'::jsonb THEN array[]::text[] ELSE (SELECT array_agg(x) FROM etc) END`. The same for *json* but using `$1::text='{}'::text`. – Peter Krauss Dec 28 '17 at 19:17
  • Oh yeah, it's a total hack and I'm sure your suggestion would be more performant. My main concern was not _silently_ losing data, which this prevents. – Joel B Dec 28 '17 at 19:20
  • 2
    I think `coalesce(array_agg(x), array[]::text[])` would be "clearer" (at least to me) than the concatenation. –  Jul 10 '18 at 14:35
  • @a_horse_with_no_name you're exactly right. I keep finding how Postgres has something builtin for every case I could think of. Feel free to edit/update the post with your suggestion. – Joel B Jul 15 '18 at 13:00
1

The optimal conversion function for Postgres 9.6 or later is this:

CREATE OR REPLACE FUNCTION jsonb_array_to_text_array(_js jsonb)
  RETURNS text[]
  LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE AS
'SELECT ARRAY(SELECT jsonb_array_elements_text(_js))';

The ARRAY constructor being cheaper than array_agg(). See:

It's also kind of important to mark it as PARALLEL SAFE.

See the step-by-step explanation in this earlier answer on dba.SE:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
0

In my case it helped to have the result reflect 3 states, null, empty text array and non-empty text array depending on the input. Hopefully this will be useful to someone.

CREATE OR REPLACE FUNCTION json_array_text_array(JSON)
  RETURNS TEXT [] AS $$
DECLARE
  result TEXT [];
BEGIN
  IF $1 ISNULL
  THEN
    result := NULL;
  ELSEIF json_array_length($1) = 0
    THEN
      result := ARRAY [] :: TEXT [];
  ELSE
    SELECT array_agg(x) FROM json_array_elements_text($1) t(x) INTO result;
  END IF;
  RETURN result;
END;
$$
LANGUAGE plpgsql
IMMUTABLE
STRICT;
Pawel Zieminski
  • 439
  • 3
  • 8
  • 1
    Hi Pawel, in general we prefer `SQL language` tham PLpgSQL because SQL can be faster, mainlly in [inline optiomizations of immutable functions](https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions) see also [this discussion](https://dba.stackexchange.com/a/8189/90651). We also most avoid the use of STRICT, as a [PostgreSQL bug in the optimization planner](https://stackoverflow.com/questions/56119704/other-way-to-cast-varbit-to-int-and-bigint) – Peter Krauss Nov 01 '20 at 12:17
0

table

CREATE TABLE IF NOT EXISTS public.j
(
    id integer,
    data jsonb
)

data

most simple solution I found

SELECT j.id, array_agg(d.elem::int2) AS ar_values
FROM j
CROSS JOIN LATERAL jsonb_array_elements(j.data) k
CROSS JOIN LATERAL jsonb_array_elements_text(k->'ar') AS d(elem)
GROUP BY j.id;

can also be simplified

SELECT j.id, array_agg(d.elem::int2) AS ar_values
FROM j
, LATERAL jsonb_array_elements(j.data) k
, LATERAL jsonb_array_elements_text(k->'ar') AS d(elem)
GROUP BY j.id;
Rami Jamleh
  • 1,819
  • 1
  • 13
  • 10