1

All the following uses of pg_read_file() are running fine:

select pg_read_file('myFile.txt'); -- ok full file
select pg_read_file('myFile-notExists',0,10,true); -- NULL is Ok! 

But these are problematic:

select pg_read_file('myFile-notExists.txt'); -- ERROR... It is not what I need
select pg_read_file('myFile.txt',0,10,true); -- ok not null, but need full file!

... I try to solve the problem with a wrap function:

CREATE or replace FUNCTION pg_read_file(f text, missing_ok boolean) RETURNS text AS $$
  SELECT pg_read_file(f,0,922337203,missing_ok) 
   -- the max 9223372036854775807 is  too large, also 9223372036854775800, ....
$$ LANGUAGE SQL IMMUTABLE;

But it say "ERROR: requested length too large" for tests on maximum length... So I using aleatory constant. Question1: what the "max file size" value?

select pg_read_file('myFile-notExists.txt',true); -- Ok! NULL is what I need
select pg_read_file('myFile.txt',true); -- Ok! full file.

Question2: is there another way to solve this problem?

At standard PostgreSQL v12 \df pg_read_file says:

   Schema   |     Name     | Result data type |      Argument data types      | Type
------------+--------------+------------------+-------------------------------+------
 pg_catalog | pg_read_file | text             | text                          | func
 pg_catalog | pg_read_file | text             | text, bigint, bigint          | func
 pg_catalog | pg_read_file | text             | text, bigint, bigint, boolean | func

Why pg_read_file(text,boolean) is not in the pg_catalog? It will produce some inconsistence with other modules or extensions?

Peter Krauss
  • 13,174
  • 24
  • 167
  • 304
  • What do you want to happen if the size of the file is larger than the maximum size of a PostgreSQL value? – jjanes Aug 07 '20 at 13:04
  • Hi @jjanes, a file *"larger than the maximum size"* is a big problem, we must avoid it... But the *length* parameter (second bigint) enforces a limit. Ideal is "free size" or the maximum of the filesystem. – Peter Krauss Aug 07 '20 at 13:09

2 Answers2

1

How about using pg_stat_file to test its existence?

CREATE or replace FUNCTION pg_read_file(f text, missing_ok boolean) RETURNS text AS $$
  SELECT case 
     when pg_stat_file(f,missing_ok) is NULL then NULL 
     else pg_read_file(f) end
$$ LANGUAGE SQL IMMUTABLE;

There is a race condition here that the file could disappear after the pg_stat_file but before the pg_read_file, leading to an error rather than NULL return.

jjanes
  • 37,812
  • 5
  • 27
  • 34
  • Thanks, you solved but, besides that race condition, the CPU waste on `pg_stat_file()` work... **Seems a bug** on `pg_read_file` function. – Peter Krauss Aug 07 '20 at 13:19
1

This is a Wiki, please EDIT to enhance this answer!

PS: check also this discussion, performance questions and this alternative.


It is a pg_read_file() bug... And for a workaround, no really good solution without race condition (remembered before by @jjanes). Real-time applications need some caution.

Here a change of approach, to retrieve pg_stat_file's information, avoiding loss of CPU.

It's a matter of taste, I think RECORD is ugly to work with, I prefer JSONb.

Imagine a cenario where you are working with JSONb, for example building back-end with PostgREST...

Reusing pg_stat_file

CREATE or replace FUNCTION jsonb_read_stat_file(
  f text,
  missing_ok boolean DEFAULT false
) RETURNS JSONb AS $f$
  SELECT j || jsonb_build_object( 'file',f,  'content',pg_read_file(f) )
  FROM to_jsonb( pg_stat_file(f,missing_ok) ) t(j)
  WHERE j IS NOT NULL
$f$ LANGUAGE SQL IMMUTABLE;

To to get only content, emulating pure read_file(x,true), you can use jsonb_read_stat_file('/tmp/text.txt',true)->>'content'.


Example of use:

echo "Hello world!" > /tmp/text.txt
psql mydb -c "SELECT jsonb_read_stat_file('/tmp/text-ERROR.txt',true) IS NULL" # true
psql mydb -c "SELECT jsonb_read_stat_file('/tmp/text.txt',true)"
 {
   "file": "/tmp/text.txt", 
   "size": 13, 
   "isdir": false, 
   "access": "2020-08-09T14:44:28+00:00",
   "change": "2020-08-09T14:44:28+00:00", 
   "content": "Hello world!\n", 
   "creation": null, 
   "modification": "2020-08-09T14:44:28+00:00"
}
Peter Krauss
  • 13,174
  • 24
  • 167
  • 304