This seems like a bug, because it is impossible to call pg_read_file or pg_read_binary_file without supplying the length before the missing_ok flag... Or is it possible? I tried the lengths NULL and -1, but neither works.

There are no clues in the documentation: https://www.postgresql.org/docs/current/functions-admin.html


Notes

  • the length (file size in bytes) is unpredictable; sometimes the files are very big.
  • I need portable calls: for example, pg_read_file(f, 0, 922337203, missing_ok) is not a solution (see the calls sketched below).
  • (edited after @LaurenzAlbe's comment) As a constant/default length I also tried 9223372036854775807, with the same problem, an error:
    in this case, "ERROR: requested length too large".
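To make the problem concrete, these are the kinds of calls I mean (the file name is only a placeholder; the comments record what I observed):

-- the call I would like to make, without a length:
SELECT pg_read_file('some_file.txt', true);      -- no such signature
-- the workarounds I tried:
SELECT pg_read_file('some_file.txt', 0, NULL, true);  -- does not work
SELECT pg_read_file('some_file.txt', 0, -1, true);    -- does not work
SELECT pg_read_file('some_file.txt', 0, 9223372036854775807, true);
-- ERROR: requested length too large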
Peter Krauss

1 Answer


That is not a bug, because it works as documented. But I understand that it bugs you.

I read the code, and there is no way to omit the length or offset if you want to specify missing_ok. You could write a patch for PostgreSQL that allows that; it could be a version of the function with two arguments.
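For reference, these are the signatures as documented (offset and length can only be given together, and missing_ok only after them):

pg_read_file ( filename text [, offset bigint, length bigint [, missing_ok boolean ] ] ) → text
pg_read_binary_file ( filename text [, offset bigint, length bigint [, missing_ok boolean ] ] ) → bytea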

But it is easy to write your own function:

CREATE FUNCTION read_file(filename text, missing_ok boolean) RETURNS text
   LANGUAGE plpgsql STRICT AS
$$DECLARE
   v_result bytea := '';
   v_chunk bytea;
   v_offset bigint := 0;
BEGIN
   LOOP
      /* read the next chunk of up to 10MB */
      v_chunk := pg_read_binary_file(filename, v_offset, 10485760, missing_ok);
      /* with missing_ok = true, a missing file yields a NULL chunk */
      IF v_chunk IS NULL THEN
         RETURN NULL;
      END IF;
      /* an empty chunk means we have reached the end of the file */
      EXIT WHEN v_chunk = '';
      v_result := v_result || v_chunk;
      v_offset := v_offset + length(v_chunk);
   END LOOP;
   RETURN convert_from(v_result, current_setting('server_encoding'));
END;$$;

The function reads the file in chunks of 10MB. The result is collected as a bytea first, to avoid errors when a chunk ends in the middle of a multi-byte character; the conversion to text happens only once, at the end. If the file is missing and missing_ok is true, pg_read_binary_file returns NULL for the first chunk, and the function returns NULL as well.
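A quick usage example (the file name is a placeholder; relative paths are resolved against the data directory):

SELECT read_file('postgresql.auto.conf', true);  -- file contents as text
SELECT read_file('no_such_file', true);          -- NULL, no error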

Laurenz Albe
  • "Space-time portability": what is Big Data today, tomorrow will be small information; and what I need on my local machine may not be portable in a large company or elsewhere. My library source-code should be generic and kept in the same *git*, for all places and all time. – Peter Krauss May 14 '21 at 18:42
  • Yes, it is a bug, but okay, it **is a specification bug**... It is as serious as a *software bug*, because user perception is the reference for any good specification... Microsoft, Oracle and other competitors will do better, and we will lose PostgreSQL over nonsense. Stack Overflow, unfortunately, is **the only place** where the *community of real users* (not developers) can complain about bugs and **vote** for improvements without barriers. – Peter Krauss May 14 '21 at 19:13
  • Thanks for the clue. I tried 9223372036854775807 and a little less, 922337203685477, but the limit is perhaps my hardware's memory: *ERROR: requested length too large*. PS: about Microsoft, Oracle, even MySQL, **I hate** them (!), but the problem is convincing the team (or leadership), especially in colonized countries like Brazil. I need a "perfect PostgreSQL" to preserve my infrastructure at work. – Peter Krauss May 14 '21 at 21:02
  • Oh! I understand your problem now. You cannot specify a large limit, since PostgreSQL will go out of memory. How about writing your own function that reads the file in chunks? I'll try to improve my answer with a suggestion. – Laurenz Albe May 16 '21 at 05:31
  • I have added a function that you can use to work around the limitation. – Laurenz Albe May 16 '21 at 15:33