0

With PowerPivot in Exycel I am able to

a) create a data model that consists of several connected tables (see example below)

b) create a pivot table based on that model

For display, the pivot table does not use id (=integer) values but the corresponding string values as row/column headers.

With pandas I could

a) load and join those related tables b) create a pivot table based on the joined table

pivot_table = pandas.pivot_table(
    joined_table,
    index=["scenario_name"], #entries to show as row headers
    columns='param_name',          #entries to show as column headers
    values='value',          #entries to aggregate and show as cells
    aggfunc=numpy.sum,       #aggregation function(s)
)

However, with huge tables, I would expect it to be more efficient, if the pivot_table operates on the non-joined data table and applies the string values only for result display.

=> Is there a convenient way to consider foreign key relations when using pandas DataFrame and pivot_table ?

I would expect something like

pivot_table = pandas.pivot_table(
    {"data": data_table, 
     "scenario": scenario_table, 
     "param": param_table
    },
    index=["scenario:name"], #entries to show as row headers
    columns="param:name",    #entries to show as column headers
    values="data:value",          #entries to aggregate and show as cells
    aggfunc=numpy.sum,       #aggregation function(s)
)

=> If not, are there some alternatives libraries to pandas that could handle related tables as source for pivot tables?

Small example table structure:


table "data"
id scenario_id param_id value
1  1           1        100
2  1           2        200

table "scenario"
id name
1  reference
2  best_case

table "param"
id name
1  solar
2  wind

scenario_id of table data points on id of table scenario

param_id of table data points on id of table param

Another example with some more columns:

enter image description here

Stefan
  • 10,010
  • 7
  • 61
  • 117
  • Pivot as usual and replace the values of index/column names? – Quang Hoang Sep 25 '20 at 11:19
  • Yes. I did not want to reinvent the wheel and hoped that there would already exist some solution. Otherwise I would have to manually adapt the multi-index and headers of the resulting data frame, looking up the meaning of the ids in the corresponding tables. https://stackoverflow.com/questions/29150346/pandas-modify-a-particular-level-of-multiindex – Stefan Sep 25 '20 at 14:07

0 Answers0