
I want to generate a dataframe with pandas read_sql from my SQLAlchemy query, unpacking a PostgreSQL jsonb attribute into columns.

Actually, the following already gives me the answer I want:

query = session.query(
    cls.id,
    cls._my_jsonb_column
).all()
pd.DataFrame([dict(id=id_, **i) for id_, i in query])
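As a side note, the same application-side unpacking can be done with pandas' json_normalize, which also copes with nested objects. A minimal sketch, with hypothetical rows standing in for the result of session.query(cls.id, cls._my_jsonb_column).all():

```python
import pandas as pd

# Hypothetical stand-in for the (id, jsonb-dict) tuples returned by the query.
rows = [(1, {"a": 10, "b": "x"}), (2, {"a": 20, "b": "y"})]

# Merge the id into each payload, then let json_normalize build the columns.
df = pd.json_normalize([dict(id=id_, **payload) for id_, payload in rows])
print(df.columns.tolist())
```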

But I would prefer to unpack the jsonb with PostgreSQL instead of in the application.

My attempt gives

query = session.query(
    cls.id,
    func.jsonb_to_record(cls._my_jsonb_column)
)
pd.read_sql(query.statement, query.session.bind)

(psycopg2.NotSupportedError) function returning record called in context that cannot accept type record
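The error comes from PostgreSQL itself: jsonb_to_record returns a generic record, so it can only appear in the FROM clause together with a column definition list, never in the SELECT list. In plain SQL the working shape looks like this (table, column, and key names here are hypothetical):

```sql
SELECT t.id, r.a, r.b
FROM my_table AS t,
     jsonb_to_record(t.data) AS r(a int, b text);
```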

Joost Döbken

1 Answer


jsonb_to_record (and jsonb_to_recordset) returns records as if they were the result of a SELECT query. In a SQLAlchemy context, it provides a selectable which can be used like a table.

So you should consider the result of func.jsonb_to_record(cls._my_jsonb_column) as a sort of table you can join to your original table.

That is, your query should look like:

jsonb_data = func.jsonb_to_record(cls._my_jsonb_column)
query = select(
    [cls.id, <other columns>]
).select_from(
    join(cls, jsonb_data, <on_clause>)
)
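With SQLAlchemy 1.4+ this can be spelled out concretely via table_valued(), which also produces the column definition list PostgreSQL requires. A sketch assuming the jsonb holds keys a (integer) and b (text); the model, table, and key names are placeholders standing in for cls and its columns:

```python
from sqlalchemy import Column, Integer, Text, column, func, select, true
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class MyModel(Base):  # hypothetical stand-in for cls
    __tablename__ = "my_table"
    id = Column(Integer, primary_key=True)
    _my_jsonb_column = Column("data", JSONB)

# Declare the record's columns so the compiled SQL gets the required
# "AS r(a INTEGER, b TEXT)" column definition list.
r = (
    func.jsonb_to_record(MyModel._my_jsonb_column)
    .table_valued(column("a", Integer), column("b", Text))
    .render_derived(name="r", with_types=True)
)

# JOIN ... ON true: for functions in FROM, PostgreSQL allows references
# to earlier FROM items even without an explicit LATERAL keyword.
stmt = select(MyModel.id, r.c.a, r.c.b).join(r, true())

print(stmt.compile(dialect=postgresql.dialect()))
```

The compiled statement can then be handed to pd.read_sql as in the question.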

You could even flatten the JSON using PostgreSQL's JSON processing functions, but it is hard to be more precise without knowing the structure of your data.
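For instance, if only a handful of known keys are needed, the jsonb ->> operator (exposed in SQLAlchemy as indexing plus .astext) flattens them without any join. A sketch with hypothetical model and key names:

```python
from sqlalchemy import Column, Integer, cast, select
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class MyModel(Base):  # hypothetical stand-in for cls
    __tablename__ = "my_table"
    id = Column(Integer, primary_key=True)
    _my_jsonb_column = Column("data", JSONB)

# ->> always yields text, so cast where a typed column is wanted.
stmt = select(
    MyModel.id,
    cast(MyModel._my_jsonb_column["a"].astext, Integer).label("a"),
    MyModel._my_jsonb_column["b"].astext.label("b"),
)

print(stmt.compile(dialect=postgresql.dialect()))
```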

Alternatively, I recently released a package to easily flatten JSONB fields from a description of the JSON data; I would be happy to have some feedback: pg_jsonb_flattener

Tryph