1

I need a solution to the following problem with a clear PostgreSQL 9.4:

  • Read a zipped file from server into a bytea column
  • No extensions allowed
  • Superuser allowed
  • Postgres user has permission to read the file
  • Can be a user function

EDIT:

The file is outside cluster path, so normal functions raise:
SQL Error: ERROR: absolute path not allowed

Luiz Vaz
  • 1,669
  • 1
  • 19
  • 32

3 Answers3

1

Here is a simple function get_file_contents(filename text) returns bytea for the job.

create or replace function get_file_contents(filename text) returns bytea as
$fn$
 declare 
    lo_oid oid;
    retval bytea;
 begin
    lo_oid := lo_import(filename);
    retval := lo_get(lo_oid);
    perform lo_unlink(lo_oid);
    return retval;
 end;
$fn$ language plpgsql;
  • Trivial usage
-- Read the great work of Sun Tzu
select get_file_contents('/media/data/ForeignData/The Art Of War.pdf');

-- Insert into a table, update a table
insert into mytable (mycolumn[,<others>]) values (get_file_contents(myfilename)[,<others>]);
update mytable set mycolumn = get_file_contents(myfilename) where <whatever there>;
Stefanov.sm
  • 11,215
  • 2
  • 21
  • 21
1

Use the built-in function pg_read_binary_file(). It's available since Postgres 9.1 and does exactly what you want. The manual:

Returns all or part of a file. This function is identical to pg_read_file except that it can read arbitrary binary data, returning the result as bytea not text; accordingly, no encoding checks are performed.

This function is restricted to superusers by default, but other users can be granted EXECUTE to run the function.

So to ...

Read a zipped file from server into a bytea column

UPDATE custom_table
SET    content = pg_read_binary_file('/tmp/28528026bc302546d17ce7e82400ab7e.zip')
WHERE  id = 123;

Considerably faster than any workaround.

Note this restriction, quoting the manual:

Only files within the database cluster directory and the log_directory can be accessed. Use a relative path for files in the cluster directory, and a path matching the log_directory configuration setting for log files.

You can overcome the path restriction with a symlink from your db directory to any other directory. Be wary of possible security implications, though. See:

Also, consider upgrading to a current version of Postgres Postgres 9.4 has reached EOL on February 13, 2020.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Yep, I tried. But for some unknown reason the field is not loaded with the file contents... – Luiz Vaz Jan 17 '21 at 00:29
  • I don't own the database, it's belongs some Gov. Department. By the away this request was filled. – Luiz Vaz Jan 17 '21 at 00:40
  • nope. The field just remains empty using normal methods. The most strange part is that the contents are shown using simple Select. But fails when used with Insert or Update. I can't put the database into deep trace mode, because its accessed by more than 40.000 users daily. – Luiz Vaz Jan 17 '21 at 00:49
  • 1
    You are aware of this bit from the manual? `Only files within the database cluster directory and the log_directory can be accessed. Use a relative path for files in the cluster directory, and a path matching the log_directory configuration setting for log files.` Also, upgrading the Postgres version might help. – Erwin Brandstetter Jan 17 '21 at 01:06
  • Generally, if it works in a `SELECT`, it should also work in an `INSERT` or `UPDATE`. That sounds *very* odd. To be sure: use `pg_read_binary_file()`, not `pg_read_file()`. – Erwin Brandstetter Jan 17 '21 at 01:09
0

After a lot o research, I came out with the following function:

  CREATE OR REPLACE FUNCTION file_read(file text)  
  RETURNS bytea AS $$
    DECLARE
      content text;
      tmp text;
    BEGIN
      file := quote_literal(file);
      tmp := quote_ident(md5(random()::text));

      -- create tmp table using random name
      EXECUTE 'CREATE TEMP TABLE ' || tmp || ' (id oid, file_name text, content bytea)';
      
      -- add given filename
      EXECUTE 'INSERT INTO '|| tmp ||' (file_name) VALUES('|| file ||')';
            
      -- add the document to large object storage and return the link id
      BEGIN
           EXECUTE 'UPDATE ' || tmp || ' SET id = lo_import(file_name) ';
      EXCEPTION WHEN OTHERS THEN
           RETURN NULL;
      END;
            
      -- pull document from large object storage
      EXECUTE 'UPDATE ' || tmp || ' SET content = lo_get(id) ';
      
      -- delete the file from large object storage
      EXECUTE 'SELECT lo_unlink(id) FROM ' || tmp;
      
      -- save data to content variable
      EXECUTE 'SELECT content FROM ' || tmp INTO content;
      
      -- drop tmp table      
      EXECUTE 'DROP TABLE ' || tmp;

      -- return 
      RETURN content;
    END;
  $$ LANGUAGE plpgsql VOLATILE;

Sample use case :

Read from file
select file_read(concat('/tmp/', '28528026bc302546d17ce7e82400ab7e.zip')

Update column
update custom_table set content = file_read(filename)

Luiz Vaz
  • 1,669
  • 1
  • 19
  • 32