0

I have a jsonb structure on postgres named data where each row (there are around 3 million of them) looks like this:

[
    {
        "number": 100,
        "key": "this-is-your-key",
        "listr": "20 Purple block, THE-CITY, Columbia",
        "realcode": "LA40",
        "ainfo": {
            "city": "THE-CITY",
            "county": "Columbia",
            "street": "20 Purple block",
            "var_1": ""
        },
        "booleanval": true,
        "min_address": "20 Purple block, THE-CITY, Columbia LA40"
    },
    .....
]

I would like to query the min_address field in the fastest possible way. In Django I tried to use:

APModel.objects.filter(data__0__min_address__icontains=search_term)

but this takes ages to complete (also, "THE-CITY" is in uppercase, so, I have to use icontains here. I tried dropping to rawsql like so:

cursor.execute("""\
    SELECT * FROM "apmodel_ap_model" 
    WHERE ("apmodel_ap_model"."data" 
    #>> array['0', 'min_address'])
    @> %s \
    """,\
    [json.dumps([{'min_address': search_term}])]
)

but this throws me strange errors like:

LINE 4:       @> '[{"min_address": "some lane"}]'       
              ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

I am wondering what is the fastest way I can query the field min_address by using rawsql cursors.

cezar
  • 11,616
  • 6
  • 48
  • 84
AJW
  • 5,569
  • 10
  • 44
  • 57
  • The field `data` contains a JSON, which is in fact an array with plenty of objects. Do you really want to search for the `min_address` only in the first object in the array (`data__0__min_address`)? – cezar Mar 14 '18 at 09:00
  • @cezar: you are correct - I would like to search ALL the objects in the array. How do I do this? This is why I wanted to do a rawSQL search. – AJW Mar 14 '18 at 12:42

1 Answers1

0

Late answer, probably it won't help OP anymore. Also I'm not at all an expert in Postgres/JSONB, so this might be a terrible idea.

Given this setup;

so49263641=# \d apmodel_ap_model;
         Table "public.apmodel_ap_model"
 Column | Type  | Collation | Nullable | Default
--------+-------+-----------+----------+---------
 data   | jsonb |           |          |

so49263641=# select * from apmodel_ap_model ;
                                           data
-------------------------------------------------------------------------------------------
 [{"number": 1, "min_address": "Columbia"}, {"number": 2, "min_address": "colorado"}]
 [{"number": 3, "min_address": "  columbia "}, {"number": 4, "min_address": "California"}]
(2 rows)

The following query "expands" objects from data arrays to individual rows. Then it applies pattern matching to the min_address field.

so49263641=# SELECT element->'number' as number, element->'min_address' as min_address 
    FROM apmodel_ap_model ap, JSONB_ARRAY_ELEMENTS(ap.data) element 
    WHERE element->>'min_address' ILIKE '%col%';
 number |  min_address
--------+---------------
 1      | "Columbia"
 2      | "colorado"
 3      | "  columbia "
(3 rows)

However, I doubt it will perform well on large datasets as the min_address values are casted to text before pattern matching.

Edit: Some great advice here on indexing JSONB data for search https://stackoverflow.com/a/33028467/1284043

MarkM
  • 798
  • 6
  • 17