0

I have a large table of data in a PostgreSQL database from a research study where a number of test subjects performed a number of tasks, and several sensors gathered data while they performed these tasks. I need to create a table with only one row for each subject, and columns corresponding to the sensor measurements for each task. A simplified example of what I am trying to achieve: Sample data and sample output

A couple of key issues:

  • I have already aggregated the data I need so there is only one row for each subject/task combination.
  • Not all subjects are guaranteed to have completed every task. If a subject has not completed a task, there is no corresponding subject/task row.
  • The output needs to have a column corresponding to every possible combination of sensor and task.
  • The actual table I am working with has 37 columns of sensor data, 24 tasks, and each task was performed twice in a pre and post session. Thus the number of columns I need is 37x24x2=1776 columns of data!

I could create a new view for every task and then do an outer join on the subject column, but that would still require manually creating a separate view for all 24 tasks and specifying names for all 1776 columns.

How do I build a query which automatically creates the necessary columns?

mooglinux
  • 815
  • 1
  • 11
  • 27
  • Create the columns, but in what kind of object? A table cannot have 1776 columns unless recompiling with larger pages. See [What is the maximum number of columns in a PostgreSQL select query](https://stackoverflow.com/questions/12606842) – Daniel Vérité Oct 05 '17 at 10:57

3 Answers3

1

I think you have a few particular problems here:

  1. Queries need to know columns at the outset for planning purposes.
  2. You want to do a fairly massive pivot.

I suspect that tablefunc's crosstab function will help a lot (or could be hacked to do what you want if necessary since it was written to be an example of a C-language table valued function). However you would still have to generate the crosstab criteria outside the query and run it.

This leads to two options for doing this:

  1. Create the query outside the db and feed it in with the massive cross tab or
  2. Create the cross tab inside the database in a pl/pgsql stored procedure and return a refcursor.

I suspect in most cases, the first is better.

Chris Travers
  • 25,424
  • 6
  • 65
  • 182
  • How would I generate the cross tab? Would I want to use string manipulation in something like python? Are there better ways to go about it? – mooglinux Oct 03 '17 at 22:18
1

You can increase postgresql column count. The info at the following link :

https://manual.limesurvey.org/Instructions_for_increasing_the_maximum_number_of_columns_in_PostgreSQL_on_Linux

Additional, json type columns may be a good alternative for large column set.

ugurlu2001
  • 41
  • 5
0

Postgresql is very well documented, please check the postgresql help on Web site or download pdf document.

https://www.postgresql.org/docs/current/static/datatype-json.html

ugurlu2001
  • 41
  • 5