I have a Dataframe, simlified as follows.
ID | year | y | A1 | A2 | ... | A52 | B1 | ...
1 1980 | 10| 1 | 2 | ... | 52 | 53 |
1 1981 | 20| 1 | 2 | ... | 52 | 53 |
1 1982 | 30| 1 | 2 | ... | 52 | 53 |
2 1980
2 1981
2 1982
...
1000
I need to reorganize it into:
ID | 1982_y | A1(from 1980) | ... | A52 | A1(from 1981) | ... | A1(from 1982)| ... | A52 |
1 | 30 | 1 | 2 | 3 | 4 | ...
2
3
4
...
1000
where I need to concatenate all the values from A1
to A52
for each year of the same location. Only the last y
of each ID
is kept as 1982_y
. Other values are discarded. By the end of the operation, there should be one row for each location, so 1000 rows.
For example, we will merge 1980, 1981, 1982 for ID=1 into one row. The last value of y, 30 is kept as 1982_y
. Then we concatenate the rest of the values for all 3 years into one line. If we have A B C, 52 for each, then we have 52x3 = 156 values.
I'm thinking I should iterate over the rows first, and then iterate over the columns, store A1-A52
and y
of last year to a Datafame. Append this to the dataframe for each iteration. I'm not sure how to carry out this operation in pandas. Many thanks!