Sample Data:
id | start_to_end_date | attribute | val |
---|---|---|---|
1395287 | 2019-01-01 - 2019-12-31 | CostOfGoodsAndServicesSold | 49.369 |
1395288 | 2020-01-01 - 2020-12-31 | CostOfGoodsAndServicesSold | 75.476 |
1395289 | 2020-04-01 - 2020-06-30 | CostOfGoodsAndServicesSold | 17.69 |
1395290 | 2020-07-01 - 2020-09-30 | CostOfGoodsAndServicesSold | 19.056 |
1395291 | 2021-01-01 - 2021-12-31 | CostOfGoodsAndServicesSold | 137.292 |
If some of you are familiar with financial statements and Index Match in Excel, I would basically like to recreate what an Income Statement looks like.
I have all my unique column_names in a list. This current order is important:
column_names = ['2020-04-01 - 2020-06-30',
'2020-07-01 - 2020-09-30',
'2021-01-01 - 2021-03-31',
'2021-04-01 - 2021-06-30',
'2021-07-01 - 2021-09-30',
'2022-01-01 - 2022-03-31',
'2019-01-01 - 2019-12-31',
'2020-01-01 - 2020-12-31',
'2021-01-01 - 2021-12-31']
I also have all my unique row names in a list. This current order is important:
row_names = ['RevenueFromContractWithCustomerExcludingAssessedTax',
'CostOfGoodsAndServicesSold',
'GrossProfit',
'ResearchAndDevelopmentExpense']
The final dataframe would have the above column_names
going across, the above row_names
going down (as the index).
The values that would fill in dataframe would be from df['val']
, retrieved by matching df['attribute'] == column_names[i]
and df['start_to_end_date'] == row_names[i]
.
How would I do this?
- I tried to create an empty Dataframe first with the 2 lists but there is no place to query for
df['val']
:
pd.DataFrame(row_names , columns=column_names, index=[i[0] for i in row_names ])
- I'm imagining something in the end would look like:
df_new = np.where((df['attribute'] == column_names) & (df['start_to_end_date'] == row_names), df['val'], None)
.
Ideal End result:
2020-04-01 - 2020-06-30 | 2020-07-01 - 2020-09-30 | 2021-01-01 - 2021-03-31 | 2021-04-01 - 2021-06-30 | |
---|---|---|---|---|
RevenueFromContractWithCustomerExcludingAssessedTax | ##.## from df[val] | ##.## | ##.## | ##.## |
CostOfGoodsAndServicesSold | 17.69 | 19.056 | ##.## | ##.## |
GrossProfit | ##.## from df[val] | ##.## | ##.## | ##.## |