Flattening or pivoting a table with subjects/tasks from a study with PostgreSQL

Question

I have a large table of data in a PostgreSQL database from a research study where a number of test subjects performed a number of tasks, and several sensors gathered data while they performed these tasks. I need to create a table with only one row for each subject, and columns corresponding to the sensor measurements for each task. A simplified example of what I am trying to achieve:

A couple of key issues:

I have already aggregated the data I need so there is only one row for each subject/task combination.
Not all subjects are guaranteed to have completed every task. If a subject has not completed a task, there is no corresponding subject/task row.
The output needs to have a column corresponding to every possible combination of sensor and task.
The actual table I am working with has 37 columns of sensor data, 24 tasks, and each task was performed twice in a pre and post session. Thus the number of columns I need is 37x24x2=1776 columns of data!

I could create a new view for every task and then do an outer join on the subject column, but that would still require manually creating a separate view for all 24 tasks and specifying names for all 1776 columns.

How do I build a query which automatically creates the necessary columns?

Create the columns, but in what kind of object? A table cannot have 1776 columns unless recompiling with larger pages. See [What is the maximum number of columns in a PostgreSQL select query](https://stackoverflow.com/questions/12606842) — Daniel Vérité, Oct 05 '17 at 10:57

score 1 · Accepted Answer · answered Oct 03 '17 at 20:21

I think you have a few particular problems here:

Queries need to know columns at the outset for planning purposes.
You want to do a fairly massive pivot.

I suspect that tablefunc's crosstab function will help a lot (or could be hacked to do what you want if necessary since it was written to be an example of a C-language table valued function). However you would still have to generate the crosstab criteria outside the query and run it.

This leads to two options for doing this:

Create the query outside the db and feed it in with the massive cross tab or
Create the cross tab inside the database in a pl/pgsql stored procedure and return a refcursor.

I suspect in most cases, the first is better.

How would I generate the cross tab? Would I want to use string manipulation in something like python? Are there better ways to go about it? — mooglinux, Oct 03 '17 at 22:18

score 1 · Answer 2 · answered Oct 08 '17 at 08:29

1

You can increase postgresql column count. The info at the following link :

https://manual.limesurvey.org/Instructions_for_increasing_the_maximum_number_of_columns_in_PostgreSQL_on_Linux

Additional, json type columns may be a good alternative for large column set.

answered Oct 08 '17 at 08:29

ugurlu2001

41
5

Could you elaborate a bit about using son column types? – mooglinux Oct 08 '17 at 20:33

score 0 · Answer 3 · answered Oct 08 '17 at 20:51

0

Postgresql is very well documented, please check the postgresql help on Web site or download pdf document.

https://www.postgresql.org/docs/current/static/datatype-json.html

answered Oct 08 '17 at 20:51

ugurlu2001

41
5

What I mean was, how would JSON column types help me out with this particular problem? – mooglinux Oct 09 '17 at 00:33

Flattening or pivoting a table with subjects/tasks from a study with PostgreSQL

3 Answers3