0

I have the following model of a blog post:

title = db.Column(db.String())
content = db.Column(db.String())
tags = db.Column(ARRAY(db.String))

Tags field can be an empty list.

Now I want to select all distinct tags from the database entries with max performance - excluding empty arrays.

So, say I have 3 records with the following values of the tags field:

  • ['database', 'server', 'connection']
  • []
  • ['connection', 'security']

The result would be ['database', 'server', 'connection', 'security']

The actual order is not important.

mimic
  • 4,897
  • 7
  • 54
  • 93

2 Answers2

0

The distinct() method should still work fine with array columns.

from sqlalchemy import func

unique_vals = BlogPost.query(func.unnest(BlogPost.tags)).distinct().all()

https://docs.sqlalchemy.org/en/13/orm/query.html?highlight=distinct#sqlalchemy.orm.query.Query.distinct

This would be identical to running a query in postgres:

SELECT DISTINCT unnest(tags) FROM blog_posts
Holden Rehg
  • 917
  • 5
  • 10
0

If you can process the results after(usually you can) and don't want to use a nested query for this sort of thing, I usually resort to doing something like;

func.array_agg(func.array_to_string(BlogPost.tags, "||")).label("tag_lists")

and then split on the join string(||) after.

karuhanga
  • 3,010
  • 1
  • 27
  • 30