1

I try to select only lexemes from unnested ts_vector column:

select lexeme
from 
    (select unnest(to_tsvector('russian', description))
     from cards) as roots;

But it doesn't work, because SQL doesn't know anything about lexeme column. How can I select only lexemes from unnested ts_vectors?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Arli Chokoev
  • 521
  • 6
  • 22

2 Answers2

3

I found a laconic way to do it:

SELECT (unnest(to_tsvector(description))).lexeme
FROM cards
Arli Chokoev
  • 521
  • 6
  • 22
2

What you found yourself:

SELECT (unnest(to_tsvector(description))).lexeme
FROM   cards;

The equivalent standard SQL form with the set-returning function in the FROM list is slightly more verbose, but easier to integrate in bigger queries:

SELECT d.lexeme
FROM   cards c
LEFT   JOIN LATERAL unnest(to_tsvector(c.description))) d ON true;

Related:

Why? How?

Since Postgres 9.6 there is a second "overloaded" variant of unnest(). Quoting the release notes:

  • Add new functions for tsvector data (Stas Kelvich)

    The new functions are ts_delete(), ts_filter(), unnest(), tsvector_to_array(), array_to_tsvector(), and a variant of setweight() that sets the weight only for specified lexeme(s).

Bold emphasis mine.

See:

SELECT proname, proargtypes::regtype[], prorettype::regtype
FROM   pg_proc
where  proname = 'unnest';
proname | proargtypes      | prorettype
--------+------------------+-----------
unnest  | [0:0]={anyarray} | anyelement
unnest  | [0:0]={tsvector} | record    
(2 rows)

db<>fiddle here

The function is documented in the manual among text search functions:

unnest ( tsvector ) → setof record ( lexeme text, positions smallint[], weights text )

It returns setof record with named output columns. Hence we can refer to the column lexeme directly like we did.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228