0

I am wondering if it is possible to extract the column name, data type, and one sample value from a database table with a single PostgreSQL query. I'm aiming to do this for all columns of one table.

Feels like a variable is needed for the column name so you can use it when querying the table for the sample value but Postgres doesn't support this in plain SQL statement (How to declare a variable in a PostgreSQL query).

I can achieve this through hard-coding a single value but any suggestions on whether this is possible to do for each column of the table (join each one using its name in a select statement to obtain the single sample value)?

column   | data_type | sample_val
--------------------------------
foo_col1 |   text    | NULL
foo_col2 |   text    | 'foo_val2'
foo_col3 |   text    | NULL
select column_name as column, data_type, sample_val
from information_schema.columns t1
join pg_class t2 on (t1.table_name = t2.relname)
left outer join pg_description t3 on (t2.oid = t3.objoid and t3.objsubid = t1.ordinal_position)

left outer join (select CAST('foo_col2' AS text) as foo_col2, foo_col2 as sample_val from foo_schema.foo_table limit 1) n2
on (n2.foo_col2 = column_name)

where table_schema = 'foo_schema'
and table_name = 'foo_table'
order by ordinal_position
Andrew Goss
  • 45
  • 1
  • 6

1 Answers1

0

You can use a variation of row count for all tables for this:

select c.table_schema, c.table_name,
       c.column_name,
       c.data_type,
       (xpath('/table/row/'||column_name||'/text()', 
          query_to_xml(format('select %I 
                               from %I.%I limit 1', c.column_name, c.table_schema, c.table_name), true, false, '')))[1]::text as sample_value
from information_schema.columns c 
where table_schema = 'foo_schema';

query_to_xml() will run the query and format the result as XML. The xpath() function then extracts that column value from the XML.

This is quite expensive as the query is run once for each column, not once for each table. Note that the sample values might not be from the same row.

You can optimize this by just running one query per table and then joining that result back to the columns:

with samples as (
  select table_schema, 
         table_name, 
         query_to_xml(format('select * from %I.%I limit 1', table_schema, table_name), true, false, '') as sample_row
  from information_schema.tables 
  where table_schema = 'foo_schema'
)
select c.table_schema, c.table_name,
       c.column_name,
       c.data_type,
       (xpath('/table/row/'||column_name||'/text()', s.sample_row))[1]::text as sample_value
from information_schema.columns c 
  join samples s on (s.table_schema, s.table_name) = (c.table_schema, c.table_name);

With the above, all sample values are from the same row.

  • Thanks for the idea! Definitely on the right path here, but there is an error parsing the XML with the xpath() function for the statement you provided. Trying to figure it out. Using this version of Postgres: `PostgreSQL 9.3.16` – Andrew Goss Mar 10 '18 at 15:09
  • Can definitely pinpoint the issue to xpath() function call but stuck on what is wrong with it. This query gives the XML for the sample_value: `select c.table_schema, c.table_name, c.column_name, c.data_type, query_to_xml(format('select %I from %I.%I limit 1', c.column_name, c.table_schema, c.table_name), true, true, '')::text as sample_value from information_schema.columns c where table_schema = 'foo_schema'; " 1519598724863690 "` – Andrew Goss Mar 10 '18 at 15:22
  • Hard-coding works: `with samples as ( select table_schema, table_name, query_to_xml(format('select * from %I.%I limit 1', table_schema, table_name), true, true, '') as sample_row from information_schema.tables where table_schema = 'foo_schema' ) select c.table_schema, c.table_name, c.column_name, c.data_type, (xpath('/row/foo_col/text()', s.sample_row))[1]::text as sample_value from information_schema.columns c join samples s on (s.table_schema, s.table_name) = (c.table_schema, c.table_name) where column_name = 'foo_col';` – Andrew Goss Mar 10 '18 at 20:07
  • @AndrewGoss: this works for me with 9.3 but only with tables that contain at least one row. If the result of `query_to_xml` is null then I get an error "could not parse XML document". –  Mar 12 '18 at 09:14