11

Sorry for the basic question.

I have a table with the following columns.

      Column |  Type   | Modifiers 
     --------+---------+-----------
      id     | integer | 
      doc_id | bigint  | 
      text   | text    | 
  • I am trying to do text matching on the 'text' (3rd column)
  • I receive an error message when I try to text match on the text column. Saying that the string is too long for ts_vector.

I only want observations which contain the words "other events"

    SELECT * FROM eightks\d
    WHERE to_tsvector(text) @@ to_tsquery('other_events')

I know that there are limitation to the length of the ts_vector.

Error Message

   ERROR:  string is too long for tsvector (2368732 bytes, max 1048575 bytes)

How do I convert the text column into a ts_vector and will this resolve my size limit problem?Alternatively, How do I exclude observations over the maximum size?

Postgres version 9.3.5.0

Here is the reference to the limit limit

Thanks

CISCO
  • 539
  • 1
  • 4
  • 14
  • What is the exact text of the error message? (**Always** include exact error message text, and PostgreSQL version, in your questions, along with the exact SQL run). – Craig Ringer Sep 08 '14 at 02:03
  • All done @craigringer. Sorry! – CISCO Sep 08 '14 at 02:10
  • 1
    ... and the PostgreSQL version? Anyway: I wasn't aware `tsvector` had a size limit. Surprised me. – Craig Ringer Sep 08 '14 at 02:12
  • @CraigRinger ... eventually I will get it ;-) – CISCO Sep 08 '14 at 02:27
  • 4
    @wazza2013 did you get to the bottom of this issue? I'm facing the same problem here. – 0bserver07 May 26 '15 at 22:07
  • How is this question different from your other one you asked one day earlier? https://stackoverflow.com/questions/25715721/postgresql-selecting-observations-and-putting-in-new-table – maxschlepzig Aug 10 '19 at 08:05
  • 1
    Possible duplicate of [Postgresql - selecting observations and putting in new table](https://stackoverflow.com/questions/25715721/postgresql-selecting-observations-and-putting-in-new-table) – maxschlepzig Aug 10 '19 at 15:48
  • The [tsvector2](https://github.com/postgrespro/tsvector2) extension might be a possible solution. – Mark G Apr 16 '21 at 23:46

1 Answers1

0

One way to deal ts_vector size limitations is to split the documents in the text column into smaller chunks for searching, then merge the segments. After breaking the column you can store the chunks into separate rows or columns.

Another way to work around the ts_vector size limitations is to call to_tsvector() with a truncated text value. This can be achieved by creating a trigger function that sets the ts_vector column to a truncated text value.

CREATE FUNCTION tsv_trigger_fn() RETURNS 
trigger AS $$
begin
  new.tsv_text := to_tsvector('english', left(new.text, 1*1024*1024));
  return new;
end
$$ LANGUAGE plpgsql;

CREATE TRIGGER tsv_trigger BEFORE INSERT OR UPDATE
  ON <your_table> FOR EACH ROW EXECUTE FUNCTION tsv_trigger_fn();